Throughout the course of history, engineering and mathematics have developed in
parallel. All branches of engineering depend on mathematics for their description and
there has been a steady flow of ideas and problems from engineering that has stimulated
and sometimes initiated branches of mathematics. Thus it is vital that engineering
students receive a thorough grounding in mathematics, with the treatment related to their
interests and problems. As with the previous editions, this has been the motivation for
the production of this fourth edition, a companion text to the fourth edition of Modern
Engineering Mathematics, the latter being designed to provide a first-level core studies
course in mathematics for undergraduate programmes in all engineering disciplines.
Building on the foundations laid in the companion text, this book gives an extensive
treatment of some of the more advanced areas of mathematics that have applications in
various fields of engineering, particularly as tools for computer-based system modelling,
analysis and design. Feedback from users of the previous editions on subject
content has been highly positive, indicating that it is sufficiently broad to provide the
necessary second-level, or optional, studies for most engineering programmes, where
in each case a selection of the material may be made. Whilst designed primarily for use
by engineering students, it is believed that the book is also suitable for use by students
of applied mathematics and the physical sciences.

Although the pace of the book is at a somewhat more advanced level than the
companion text, the philosophy of learning by doing is retained with continuing emphasis
on the development of students’ ability to use mathematics with understanding to solve
engineering problems. Recognizing the increasing importance of mathematical modelling
in engineering practice, many of the worked examples and exercises incorporate
mathematical models that are designed both to provide relevance and to reinforce the
role of mathematics in various branches of engineering. In addition, each chapter
contains specific sections on engineering applications, and these form an ideal framework
for individual, or group, study assignments, thereby helping to reinforce the skills of
mathematical modelling, which are seen as essential if engineers are to tackle the
increasingly complex systems they are being called upon to analyse and design. The
importance of numerical methods in problem solving is also recognized, and their
treatment is integrated with the analytical work throughout the book.

Much of the feedback from users relates to the role and use of software packages, 
particularly symbolic algebra packages. Without making it an essential requirement the 
authors have attempted to highlight throughout the text situations where the user could 
make effective use of software. This also applies to exercises and, indeed, a limited 
number have been introduced for which the use of such a package is essential. Whilst 
any appropriate piece of software can be used, the authors recommend the use of 
MATLAB and/or MAPLE. In this new edition more copious reference to the use of these
two packages is made throughout the text, with commands or codes introduced and
illustrated. When indicated, students are strongly recommended to use these packages 
to check their solutions to exercises. This is not only to help develop proficiency in their 
use, but also to enable students to appreciate the necessity of having a sound knowledge 
of the underpinning mathematics if such packages are to be used effectively. Throughout 
the book two icons are used: 


• An open screen indicates that the use of a software package would be useful
(e.g. for checking solutions) but not essential.

• A closed screen indicates that the use of a software package is essential or
highly desirable.


As indicated earlier, feedback on content from users of previous editions has been 
favourable, and consequently no new chapter has been introduced. However, in 
response to feedback the order of presentation of chapters has been changed, with a 
view to making it more logical and appealing to users. This re-ordering has necessitated 
some redistribution of material both within and across some of the chapters. Another 
new feature is the introduction of the use of colour. It is hoped that this will make the text 
more accessible and student-friendly. Also, in response to feedback individual chapters 
have been reviewed and updated accordingly. The most significant changes are: 


• Chapter 1 Matrix Analysis: Inclusion of new sections on ‘Singular value
decomposition’ and ‘Lyapunov stability analysis’.

• Chapter 5 Laplace transform: Following re-ordering of chapters a more unified
and extended treatment of transfer functions/transfer matrices for continuous-time
state-space models has been included.

• Chapter 6 Z-transforms: Inclusion of a new section on ‘Discretization of
continuous-time state-space models’.

• Chapter 8 Fourier transform: Inclusion of a new section on ‘Direct design of
digital filters and windows’.

• Chapter 9 Partial differential equations: The treatment of first order equations
has been extended and a new section on ‘Integral solution’ included.

• Chapter 10 Optimization: Inclusion of a new section on ‘Least squares’.


A comprehensive Solutions Manual is available free of charge to lecturers adopting this
textbook. It will also be available for download via the Web at: www.pearsoned.co.ck/james.


1. MATRIX ANALYSIS


1.1 Introduction


In this chapter we turn our attention again to matrices, first considered in Chapter 5 
of Modern Engineering Mathematics, and their applications in engineering. At the 
outset of the chapter we review the basic results of matrix algebra and briefly introduce 
vector spaces. 

As the reader will be aware, matrices are arrays of real or complex numbers, and have
a special, but not exclusive, relationship with systems of linear equations. An (incorrect)
initial impression often formed by users of mathematics is that mathematicians have
something of an obsession with these systems and their solution. However, such systems
occur quite naturally in the process of numerical solution of ordinary differential
equations used to model everyday engineering processes. In Chapter 9 we shall see that they
also occur in numerical methods for the solution of partial differential equations, for
example those modelling the flow of a fluid or the transfer of heat. Systems of linear
first-order differential equations with constant coefficients are at the core of the
state-space representation of linear system models. Identification, analysis and indeed design
of such systems can conveniently be performed in the state-space representation, with
this form assuming a particular importance in the case of multivariable systems.

In all these areas it is convenient to use a matrix representation for the systems under 
consideration, since this allows the system model to be manipulated following the rules 
of matrix algebra. A particularly valuable type of manipulation is simplification in some 
sense. Such a simplification process is an example of a system transformation, carried 
out by the process of matrix multiplication. At the heart of many transformations are 
the eigenvalues and eigenvectors of a square matrix. In addition to providing the means 
by which simplifying transformations can be deduced, system eigenvalues provide vital 
information on system stability, fundamental frequencies, speed of decay and long-term 
system behaviour. For this reason, we devote a substantial amount of space to the 
process of their calculation, both by hand and by numerical means when necessary. Our 
treatment of numerical methods is intended to be purely indicative rather than complete, 
because a comprehensive matrix algebra computational tool kit, such as MATLAB, is 
now part of the essential armoury of all serious users of mathematics. 

In addition to developing the use of matrix algebra techniques, we also demonstrate 
the techniques and applications of matrix analysis, focusing on the state-space system model 
widely used in control and systems engineering. Here we encounter the idea of a function 
of a matrix, in particular the matrix exponential, and we see again the role of the 
eigenvalues in its calculation. This edition also includes a section on singular value 
decomposition and the pseudo inverse, together with a brief section on Lyapunov stability 
of linear systems using quadratic forms. 


1.2 Review of matrix algebra


This section contains a summary of the definitions and properties associated with matrices 
and determinants. A full account can be found in chapters of Modern Engineering 
Mathematics or elsewhere. It is assumed that readers, prior to embarking on this chapter, 
have a fairly thorough understanding of the material summarized in this section. 


1.2.1 Definitions


(a) An array of real numbers

\[
A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\
a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\
\vdots & \vdots & \vdots &        & \vdots \\
a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn}
\end{bmatrix}
\]

is called an m × n matrix with m rows and n columns. The $a_{ij}$ is referred to as the
i, jth element and denotes the element in the ith row and jth column. If m = n
then A is called a square matrix of order n. If the matrix has one column or one
row then it is called a column vector or a row vector respectively.


(b) In a square matrix A of order n the diagonal containing the elements
$a_{11}, a_{22}, \ldots, a_{nn}$ is called the principal or leading diagonal. The sum of
the elements in this diagonal is called the trace of A, that is

\[
\operatorname{trace} A = \sum_{i=1}^{n} a_{ii}
\]

(c) A diagonal matrix is a square matrix that has its only non-zero elements along the
leading diagonal. A special case of a diagonal matrix is the unit or identity matrix I
for which $a_{11} = a_{22} = \ldots = a_{nn} = 1$.


(d) A zero or null matrix 0 is a matrix with every element zero. 


(e) The transposed matrix $A^T$ is the matrix A with rows and columns interchanged,
its i, jth element being $a_{ji}$.


(f) A square matrix A is called a symmetric matrix if $A^T = A$. It is called skew
symmetric if $A^T = -A$.


1.2.2 Basic operations on matrices


In what follows the matrices A, B and C are assumed to have the i, jth elements
$a_{ij}$, $b_{ij}$ and $c_{ij}$ respectively.


Equality

The matrices A and B are equal, that is A = B, if they are of the same order m × n
and

\[
a_{ij} = b_{ij}, \quad 1 \le i \le m, \; 1 \le j \le n
\]


Multiplication by a scalar

If λ is a scalar then the matrix λA has elements $\lambda a_{ij}$.


Addition

We can only add an m × n matrix A to another m × n matrix B and the elements of the
sum A + B are

\[
a_{ij} + b_{ij}, \quad 1 \le i \le m, \; 1 \le j \le n
\]


Properties of addition

(i) commutative law: A + B = B + A

(ii) associative law: (A + B) + C = A + (B + C)

(iii) distributive law: λ(A + B) = λA + λB, λ scalar

Matrix multiplication


If A is an m × p matrix and B a p × n matrix then we define the product C = AB as the
m × n matrix with elements

\[
c_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj}, \quad i = 1, 2, \ldots, m; \; j = 1, 2, \ldots, n
\]


Properties of multiplication


(i) The commutative law is not satisfied in general; that is, in general AB ≠ BA.
Order matters and we distinguish between AB and BA by the terminology:
pre-multiplication of B by A to form AB and post-multiplication of B by A to
form BA.

(ii) Associative law: A(BC) = (AB)C

(iii) If λ is a scalar then
(λA)B = A(λB) = λAB

(iv) Distributive law over addition:
(A + B)C = AC + BC
A(B + C) = AB + AC
Note the importance of maintaining order of multiplication.

(v) If A is an m × n matrix and if $I_m$ and $I_n$ are the unit matrices of order m and n
respectively then

\[
I_m A = A I_n = A
\]


Properties of the transpose

If $A^T$ is the transposed matrix of A then

(i) $(A + B)^T = A^T + B^T$

(ii) $(A^T)^T = A$

(iii) $(AB)^T = B^T A^T$
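
The multiplication and transpose rules above can be checked numerically. The following is a minimal sketch in Python rather than the MATLAB or MAPLE used in the text; the helper names `matmul` and `transpose` and the two 2 × 2 sample matrices are illustrative assumptions, not taken from the text.

```python
def matmul(A, B):
    """Product C = AB with c_ij = sum_k a_ik * b_kj (A is m x p, B is p x n)."""
    p = len(B)
    if any(len(row) != p for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    """Interchange rows and columns of A."""
    return [list(col) for col in zip(*A)]


# Arbitrary sample matrices chosen for illustration.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

AB = matmul(A, B)  # pre-multiplication of B by A
BA = matmul(B, A)  # post-multiplication of B by A

print(AB != BA)                                              # order matters
print(transpose(AB) == matmul(transpose(B), transpose(A)))   # (AB)^T = B^T A^T
```

For these particular matrices the two products differ, while the transpose of AB agrees with the product of the transposes taken in reverse order.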


1.2.3 Determinants


The determinant of a square n × n matrix A is denoted by det A or |A|.

If we take a determinant and delete row i and column j then the determinant
remaining is called the minor $M_{ij}$ of the i, jth element. In general we can take any row
i (or column) and evaluate an n × n determinant |A| as

\[
|A| = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} M_{ij}
\]

A minor multiplied by the appropriate sign is called the cofactor $A_{ij}$ of the i, jth element
so $A_{ij} = (-1)^{i+j} M_{ij}$, and thus

\[
|A| = \sum_{j=1}^{n} a_{ij} A_{ij}
\]

Some useful properties

(i) $|A^T| = |A|$

(ii) $|AB| = |A||B|$

(iii) A square matrix A is said to be non-singular if |A| ≠ 0 and singular if |A| = 0.
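
The cofactor expansion translates directly into a recursive routine. The sketch below is in Python and expands along the first row; the function names and the two sample matrices are assumptions made for illustration. It is intended only to mirror the definition: the number of operations grows factorially with n, so it is not a practical method for large matrices.

```python
def minor(A, i, j):
    """Matrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # With 0-based indices the sign (-1)**(i+j) for row i = 0 reduces to (-1)**j.
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))


def matmul(A, B):
    """Matrix product, used here only to check |AB| = |A||B|."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Arbitrary sample matrices chosen for illustration.
A = [[1, 1, 1], [1, 2, 3], [1, 4, 9]]
B = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]
A_T = [list(col) for col in zip(*A)]

print(det(A), det(B))                         # individual determinants
print(det(A_T) == det(A))                     # property (i)
print(det(matmul(A, B)) == det(A) * det(B))   # property (ii)
```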


1.2.4 Adjoint and inverse matrices


Adjoint matrix


The adjoint of a square matrix A is the transpose of the matrix of cofactors, so for a
3 × 3 matrix A

\[
\operatorname{adj} A = \begin{bmatrix}
A_{11} & A_{12} & A_{13} \\
A_{21} & A_{22} & A_{23} \\
A_{31} & A_{32} & A_{33}
\end{bmatrix}^{T}
\]


Properties

(i) $A(\operatorname{adj} A) = |A| I$

(ii) $|\operatorname{adj} A| = |A|^{n-1}$, n being the order of A

(iii) adj(AB) = (adj B)(adj A)


Inverse matrix

Given a square matrix A, if we can construct a square matrix B such that

BA = AB = I

then we call B the inverse of A and write it as $A^{-1}$.

Properties

(i) If A is non-singular then |A| ≠ 0 and $A^{-1} = (\operatorname{adj} A)/|A|$.

(ii) If A is singular then |A| = 0 and $A^{-1}$ does not exist.

(iii) $(AB)^{-1} = B^{-1}A^{-1}$.
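
As an illustration of property (i) of the inverse, the following Python sketch builds the adjoint from cofactors and divides by the determinant, using exact rational arithmetic from the standard `fractions` module. The helper names and the sample matrix are assumptions for illustration; for matrices of any size an elimination-based routine would normally be preferred.

```python
from fractions import Fraction


def minor(A, i, j):
    """Matrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 0:
        return 1  # convention for the empty matrix, needed when A is 1 x 1
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))


def adjoint(A):
    """Transpose of the matrix of cofactors."""
    n = len(A)
    cof = [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)]
           for i in range(n)]
    return [list(col) for col in zip(*cof)]


def inverse(A):
    """Inverse as (adj A)/|A|; A is assumed to have integer entries."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular, so no inverse exists")
    return [[Fraction(a, d) for a in row] for row in adjoint(A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Arbitrary non-singular sample matrix chosen for illustration.
A = [[1, 1, 1], [1, 2, 3], [1, 4, 9]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(matmul(A, inverse(A)) == identity)
print(det(adjoint(A)) == det(A) ** (len(A) - 1))   # |adj A| = |A|^(n-1)
```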


All the basic matrix operations may be implemented in MATLAB and MAPLE 
using simple commands. In MATLAB a matrix is entered as an array, with row 
elements separated by spaces (or commas) and each row of elements separated by a 
semicolon(;), or the return key to go to a new line. Thus, for example, 

А ГИЛЕ E] 
gives 

EN 
dL 2r 3 
405 
PO 

Having specified the two matrices A and B the operations of addition, subtraction
and multiplication are implemented using respectively the commands

C=A+B, C=A-B, C=A*B

The trace of the matrix A is determined by the command trace(A), and its
determinant by det(A).

Multiplication of a matrix A by a scalar is carried out using the command *, while
raising A to a given power is carried out using the command ^. Thus, for example,
3A² is determined using the command C=3*A^2.

The transpose of a real matrix A is determined using the apostrophe ' key; that
is C=A' (to accommodate complex matrices the command C=A.' should be used).
The inverse of A is determined by C=inv(A).

For matrices involving algebraic quantities, or when exact arithmetic is desirable 
use of the Symbolic Math Toolbox is required; in which matrices must be expressed 
in symbolic form using the sym command. The command A=sym(A) generates the 
symbolic form of A. For example, for the matrix 


\[
A = \begin{bmatrix} 2.1 & 3.2 & 0.6 \\ 1.2 & 0.5 & 3.3 \\ 5.2 & 1.1 & 0 \end{bmatrix}
\]

the commands

A=[2.1 3.2 0.6; 1.2 0.5 3.3; 5.2 1.1 0];
A=sym(A)

generate

A =
[ 21/10, 16/5, 3/5]
[ 6/5, 1/2, 33/10]
[ 26/5, 11/10, 0]
Symbolic manipulation can also be undertaken in MATLAB using the MuPAD
version of Symbolic Math Toolbox.

There are several ways of setting up arrays in MAPLE; the easiest is to use the
linear algebra package LinearAlgebra so, for example, the commands:


with (LinearAlgebra) : 
а= е o 0 ЛЬ ПУ a 


return 
4 2 3 
o 
VE N^ 


with the command 


Бе=у/ес ок ЕУ: 


returning 
2 
ig = || 3} 
al 


Having specified two matrices A and B, addition and subtraction are implemented
using the commands:

C:=A+B; and C:=A-B;

Multiplication of a matrix A by a scalar k is implemented using the command k*A;
so, for example, (2A + 3B) is implemented by

2*A+3*B;

The product AB of two matrices is implemented by either of the following two
commands:

A.B; or Multiply(A,B);

(Note: A*B will not work)
The transpose, trace, determinant, adjoint and inverse of a matrix A are returned
using, respectively, the commands:

Transpose(A);
Trace(A);
Determinant(A);
Adjoint(A);
MatrixInverse(A);


1.2.5 Linear equations


In this section we reiterate some definitive statements about the solution of the system
of simultaneous linear equations

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n
\end{aligned}
\]

or, in matrix notation,

\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}
\]

that is,

\[
Ax = b \tag{1.1}
\]

where A is the matrix of coefficients and x is the vector of unknowns. If b = 0 the
equations are called homogeneous, while if b ≠ 0 they are called nonhomogeneous (or
inhomogeneous). Considering individual cases:


Case (i)

If b ≠ 0 and |A| ≠ 0 then we have a unique solution $x = A^{-1}b$.


Case (ii)

If b = 0 and |A| ≠ 0 we have the trivial solution x = 0.


Case (iii)

If b ≠ 0 and |A| = 0 then we have two possibilities: either the equations are inconsistent
and we have no solution or we have infinitely many solutions.


Case (iv)

If b = 0 and |A| = 0 then we have infinitely many solutions.

Case (iv) is one of the most important, since from it we can deduce the important
result that the homogeneous equation Ax = 0 has a non-trivial solution if and only
if |A| = 0.


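Provided that a solution to (1.1) exists it may be determined in MATLAB using the
command x=A\b. For example, the system of simultaneous equations

x + y + z = 6, x + 2y + 3z = 14, x + 4y + 9z = 36

may be written in the matrix form

\[
\underbrace{\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 4 & 9 \end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} x \\ y \\ z \end{bmatrix}}_{x}
=
\underbrace{\begin{bmatrix} 6 \\ 14 \\ 36 \end{bmatrix}}_{b}
\]

Entering A and b and using the command x = A\b provides the answer x = 1, y = 2, z = 3.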
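
For readers without access to MATLAB, the same system can be solved with a short elimination routine. The following Python sketch uses Gauss-Jordan elimination with exact fractions; the function name `solve` is an assumption for illustration, and the routine covers only Case (i), a square non-singular A. It is not a description of how the MATLAB backslash operator works internally.

```python
from fractions import Fraction


def solve(A, b):
    """Solve Ax = b for square, non-singular A by Gauss-Jordan elimination."""
    n = len(A)
    # Augmented matrix (A : b) held in exact rational arithmetic.
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot element becomes 1.
        M[col] = [v / M[col][col] for v in M[col]]
        # Remove this column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]


A = [[1, 1, 1], [1, 2, 3], [1, 4, 9]]
b = [6, 14, 36]
x = solve(A, b)
print([int(v) for v in x])
```

The returned values agree with the answer x = 1, y = 2, z = 3 quoted above.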


In MAPLE the commands

with(LinearAlgebra):
soln:=LinearSolve(A,b);

will solve the set of linear equations Ax = b for the unknown x when A and b are given.
Thus for the above set of equations the commands

with(LinearAlgebra):
A:=Matrix([[1,1,1],[1,2,3],[1,4,9]]);
b:=Vector([6,14,36]);
x:=LinearSolve(A,b);

return

\[
x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
\]


1.2.6 Rank of a matrix


The most commonly used definition of the rank, rank A, of a matrix A is that it is the order
of the largest square submatrix of A with a non-zero determinant, a square submatrix
being formed by deleting rows and columns to form a square matrix. Unfortunately it
is not always easy to compute the rank using this definition and an alternative definition,
which provides a constructive approach to calculating the rank, is often adopted. First,
using elementary row operations, the matrix A is reduced to echelon form

[Figure: schematic echelon form, with a stepped line separating the leading elements, marked *, from the entries below]

in which all the entries below the line are zero, and the leading element, marked *, in
each row above the line is non-zero. The number of non-zero rows in the echelon form
is equal to rank A.

When considering the solution of equations (1.1) we saw that provided the determinant
of the matrix A was not zero we could obtain explicit solutions in terms of the inverse matrix.
However, when we looked at cases with zero determinant the results were much less clear.
The idea of the rank of a matrix helps to make these results more precise. Defining the
augmented matrix (A : b) for (1.1) as the matrix A with the column b added to it then
we can state the results of cases (iii) and (iv) of Section 1.2.5 more clearly as follows:


If A and (A : b) have different rank then we have no solution to (1.1). If the two
matrices have the same rank then a solution exists, and furthermore the solution
will contain a number of free parameters equal to (n − rank A).

In MATLAB the rank of the matrix A is generated using the command rank(A).
For example, if 


А=| 0 


O om N 


the commands 


ЕСШЕ ЛАШ ЕЕ ОЛЕ 
rank (A) 


generate 
ans=2 


In MAPLE the command is also rank (A) . 
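
The constructive definition of rank, and the rank test for the solvability of (1.1), can be sketched in a few lines of Python. The function names `rank` and `classify` and the small test systems are assumptions made for illustration; exact fractions are used so that the test for a zero pivot is reliable.

```python
from fractions import Fraction


def rank(A):
    """Number of non-zero rows after reduction to echelon form."""
    M = [[Fraction(v) for v in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0  # index of the next pivot row
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no leading element in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [vi - factor * vr for vi, vr in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r


def classify(A, b):
    """Compare rank A with the rank of the augmented matrix (A : b)."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    r_A, r_Ab = rank(A), rank(augmented)
    if r_A != r_Ab:
        return "no solution"
    free = len(A[0]) - r_A  # number of free parameters, n - rank A
    return "unique solution" if free == 0 else f"{free} free parameter(s)"


# Illustrative systems: singular and inconsistent, singular and consistent, non-singular.
print(classify([[1, 1], [2, 2]], [1, 3]))
print(classify([[1, 1], [2, 2]], [1, 2]))
print(classify([[1, 1, 1], [1, 2, 3], [1, 4, 9]], [6, 14, 36]))
```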


1.3 Vector spaces


Vectors and matrices form part of a more extensive formal structure called a vector space. 
The theory of vector spaces underpins many modern approaches to numerical methods 
and the approximate solution of many of the equations that arise in engineering analysis. 
In this section we shall, very briefly, introduce some of the basic ideas of vector spaces 
necessary for later work in this chapter. 


Definition

A real vector space V is a set of objects called vectors together with rules for addition
and multiplication by real numbers. For any three vectors a, b and c in V and any real
numbers α and β the sum a + b and the product αa also belong to V and satisfy the
following axioms:


(a) a + b = b + a

(b) (a + b) + c = a + (b + c)

(c) there exists a zero vector 0 such that
a + 0 = a

(d) for each a in V there is an element −a in V such that
a + (−a) = 0

(e) α(a + b) = αa + αb

(f) (α + β)a = αa + βa

(g) (αβ)a = α(βa)

(h) 1a = a


It is clear that the real numbers form a vector space. The properties given are also
satisfied by vectors and by m × n matrices so vectors and matrices also form vector
spaces. The space of all quadratics a + bx + cx² forms a vector space, as can be
established by checking the axioms (a)–(h). Many other common sets of objects also form
vector spaces. If we can obtain useful information from the general structure then this
will be of considerable use in specific cases.


1.3.1 Linear independence


The idea of linear dependence is a general one for any vector space. The vector x is said
to be linearly dependent on $x_1, x_2, \ldots, x_m$ if it can be written as

\[
x = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_m x_m
\]

for some scalars $\alpha_1, \ldots, \alpha_m$. The set of vectors $y_1, y_2, \ldots, y_m$ is said to be linearly
independent if and only if

\[
\beta_1 y_1 + \beta_2 y_2 + \cdots + \beta_m y_m = 0
\]

implies that $\beta_1 = \beta_2 = \cdots = \beta_m = 0$.

Let us now take a linearly independent set of vectors $x_1, x_2, \ldots, x_m$ in V and
construct a set consisting of all vectors of the form

\[
x = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_m x_m
\]

We shall call this set $S(x_1, x_2, \ldots, x_m)$. It is clearly a vector space, since all the axioms
are satisfied.


Example 1.1

Show that

\[
e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \quad \text{and} \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
\]

form a linearly independent set and describe $S(e_1, e_2)$ geometrically.


Solution

We have that

\[
0 = \alpha e_1 + \beta e_2 = \begin{bmatrix} \alpha \\ \beta \\ 0 \end{bmatrix}
\]

is only satisfied if α = β = 0, and hence $e_1$ and $e_2$ are linearly independent.

$S(e_1, e_2)$ is the set of all vectors of the form $[\alpha \;\; \beta \;\; 0]^T$, which is just the $(x_1, x_2)$
plane and is a subset of the three-dimensional Euclidean space.


If we can find a set B of linearly independent vectors $x_1, x_2, \ldots, x_n$ in V such that

\[
S(x_1, x_2, \ldots, x_n) = V
\]

then B is called a basis of the vector space V. Such a basis forms a crucial part of the
theory, since every vector x in V can be written uniquely as

\[
x = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n
\]

The definition of B implies that x must take this form. To establish uniqueness, let us
assume that we can also write x as

\[
x = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n
\]

Then, on subtracting,

\[
0 = (\alpha_1 - \beta_1)x_1 + \cdots + (\alpha_n - \beta_n)x_n
\]

and since $x_1, \ldots, x_n$ are linearly independent, the only solution is $\alpha_1 = \beta_1, \alpha_2 = \beta_2, \ldots$;
hence the two expressions for x are the same.

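It can also be shown that any other basis for V must also contain n vectors and that
any n + 1 vectors must be linearly dependent. Such a vector space is said to have
dimension n (or infinite dimension if no finite n can be found). In a three-dimensional
Euclidean space

\[
e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad
e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad
e_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\]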


form an obvious basis, and 


жез eS 


4, = » a= ‚ 4 = 


о ьо н 
Ф н н 
— 


is also a perfectly good basis. While the basis can change, the number of vectors in the
basis, three in this case, is an intrinsic property of the vector space. If we consider the
vector space of quadratics then the sets of functions {1, x, x²} and {1, x − 1, x(x − 1)}
are both bases for the space, since every quadratic can be written as a + bx + cx² or as
A + B(x − 1) + Cx(x − 1). We note that this space is three-dimensional.


1.3.2 Transformations between bases


Since any basis of a particular space contains the same number of vectors, we can look
at transformations from one basis to another. We shall consider a three-dimensional
space, but the results are equally valid in any number of dimensions. Let $e_1, e_2, e_3$ and
$e_1', e_2', e_3'$ be two bases of a space. From the definition of a basis, the vectors $e_1'$, $e_2'$ and $e_3'$
can be written in terms of $e_1$, $e_2$ and $e_3$ as

\[
\begin{aligned}
e_1' &= a_{11}e_1 + a_{21}e_2 + a_{31}e_3 \\
e_2' &= a_{12}e_1 + a_{22}e_2 + a_{32}e_3 \\
e_3' &= a_{13}e_1 + a_{23}e_2 + a_{33}e_3
\end{aligned}
\tag{1.2}
\]

Taking a typical vector x in V, which can be written both as

\[
x = x_1 e_1 + x_2 e_2 + x_3 e_3 \tag{1.3}
\]

and as

\[
x = x_1' e_1' + x_2' e_2' + x_3' e_3'
\]

we can use the transformation (1.2) to give

\[
\begin{aligned}
x &= x_1'(a_{11}e_1 + a_{21}e_2 + a_{31}e_3) + x_2'(a_{12}e_1 + a_{22}e_2 + a_{32}e_3) + x_3'(a_{13}e_1 + a_{23}e_2 + a_{33}e_3) \\
  &= (x_1'a_{11} + x_2'a_{12} + x_3'a_{13})e_1 + (x_1'a_{21} + x_2'a_{22} + x_3'a_{23})e_2 + (x_1'a_{31} + x_2'a_{32} + x_3'a_{33})e_3
\end{aligned}
\]

On comparing with (1.3) we see that

\[
\begin{aligned}
x_1 &= a_{11}x_1' + a_{12}x_2' + a_{13}x_3' \\
x_2 &= a_{21}x_1' + a_{22}x_2' + a_{23}x_3' \\
x_3 &= a_{31}x_1' + a_{32}x_2' + a_{33}x_3'
\end{aligned}
\]

or

\[
x = Ax'
\]

Thus changing from one basis to another is equivalent to transforming the coordinates
by multiplication by a matrix, and we thus have another interpretation of matrices.
Successive transformations to a third basis will just give $x' = Bx''$, and hence the
composite transformation is $x = (AB)x''$ and is obtained through the standard matrix
rules.
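
The coordinate rule x = Ax′ can be illustrated numerically. In the Python sketch below the old basis is taken to be the standard basis, and the new basis vectors, the second matrix B and the coordinate values are hypothetical choices made purely for illustration. Consistent with (1.2), column j of A holds the coordinates of the jth new basis vector in terms of the old basis.

```python
def matvec(A, x):
    """Product of a matrix with a column vector."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Hypothetical new basis vectors, written in terms of the standard basis.
new_basis = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]

# Transformation matrix: its columns are the new basis vectors.
A = [list(col) for col in zip(*new_basis)]

x_new = [1, 2, 3]          # coordinates relative to the new basis
x_old = matvec(A, x_new)   # coordinates relative to the old basis

# Direct check: form x1'*e1' + x2'*e2' + x3'*e3' component by component.
direct = [sum(c * e[i] for c, e in zip(x_new, new_basis)) for i in range(3)]
print(x_old == direct)

# A further (hypothetical) change of basis x' = B x'' composes as x = (AB) x''.
B = [[2, 0, 0], [0, 1, 1], [0, 0, 1]]
x_third = [1, 1, 1]
print(matvec(matmul(A, B), x_third) == matvec(A, matvec(B, x_third)))
```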

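For convenience of working it is usual to take mutually orthogonal vectors as a
basis, so that $e_i^T e_j = \delta_{ij}$ and $e_i'^T e_j' = \delta_{ij}$, where $\delta_{ij}$ is the Kronecker delta

\[
\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases}
\]

Using (1.2) and multiplying out these orthogonality relations, we have

\[
e_i'^T e_j' = \Big(\sum_k a_{ki} e_k\Big)^{T} \Big(\sum_p a_{pj} e_p\Big)
= \sum_k \sum_p a_{ki} a_{pj}\, e_k^T e_p
= \sum_k \sum_p a_{ki} a_{pj}\, \delta_{kp}
= \sum_k a_{ki} a_{kj}
\]

Hence

\[
\sum_k a_{ki} a_{kj} = \delta_{ij}
\]

or in matrix form

\[
A^T A = I
\]

It should be noted that such a matrix A with $A^{-1} = A^T$ is called an orthogonal
matrix.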
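
A rotation of the axes about the third coordinate direction gives a simple orthogonal matrix. The Python sketch below uses a rotation whose cosine and sine are the rational numbers 3/5 and 4/5, an arbitrary choice that allows the condition AᵀA = I to be checked exactly; the helper names are assumptions for illustration.

```python
from fractions import Fraction


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Cosine and sine of the (arbitrarily chosen) rotation angle; c**2 + s**2 == 1.
c, s = Fraction(3, 5), Fraction(4, 5)

# Columns of A are the rotated basis vectors expressed in the original basis.
A = [[c, -s, 0],
     [s,  c, 0],
     [0,  0, 1]]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(matmul(transpose(A), A) == identity)   # A^T A = I
print(matmul(A, transpose(A)) == identity)   # so A^T also acts as the inverse of A
```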
1.3.3 Exercises


Which of the following sets form a basis for a 
three-dimensional Euclidean space? 


1] fi] fa 1] [1] [3 

(а) 10! 12, 12 (6) |0, 12|, |2 
ol lol {3 1| |3| |5 
1] [1] [2 

(с) 10, 11, 
of lol |0 


Given the unit vectors 


0 0 
е=101, е= |1, ез= |0 
0 0 1 


find the transformation that takes these to the vectors 


1 1 0 
e =} ; ej 2l 4 >, e = |0 
“lo о 1 


Under this, how does the vector
$x = x_1 e_1 + x_2 e_2 + x_3 e_3$ transform and what
is the geometrical interpretation? What
lines transform into scalar multiples of
themselves?


Show that the set of all cubic polynomials 
forms a vector space. Which of the following 
sets of functions are bases of that space? 


(а) {1, х, х2, х) 

(b 11-х, 1+х,1-=2, 1+2) 

(с) {1-х, 1+2, (1-х), 2(1+2х))} 
(d) (x(1 —3), x(1 x3), 1 233, 1x] 


(е) {1+ 2х, 2х + 3x, 3х2 + 40°, А + 1) 


Describe the vector space 
S(x - 2:0, 2x - 30, x + з?) 


What is its dimension? 


1.4 The eigenvalue problem


A problem that leads to a concept of crucial importance in many branches of
mathematics and its applications is that of seeking non-trivial solutions x ≠ 0 to the matrix
equation

\[
Ax = \lambda x
\]

This is referred to as the eigenvalue problem; values of the scalar λ for which
non-trivial solutions exist are called eigenvalues and the corresponding solutions x ≠ 0 are
called the eigenvectors. Such problems arise naturally in many branches of engineering.
For example, in vibrations the eigenvalues and eigenvectors describe the frequency and
mode of vibration respectively, while in mechanics they represent principal stresses
and the principal axes of stress in bodies subjected to external forces. In Section 1.11,
and later in Section 5.7.1, we shall see that eigenvalues also play an important role in
the stability analysis of dynamical systems.

For continuity some of the introductory material on eigenvalues and eigenvectors, 
contained in Chapter 5 of Modern Engineering Mathematics, is first revisited. 


1.4.1 The characteristic equation


The set of simultaneous equations

\[
Ax = \lambda x \tag{1.4}
\]

where A is an n × n matrix and $x = [x_1 \;\; x_2 \;\; \ldots \;\; x_n]^T$ is an n × 1 column vector can
be written in the form

\[
(\lambda I - A)x = 0 \tag{1.5}
\]

where I is the identity matrix. The matrix equation (1.5) represents simply a set of
homogeneous equations, and we know that a non-trivial solution exists if

\[
c(\lambda) = |\lambda I - A| = 0 \tag{1.6}
\]

Here c(λ) is the expansion of the determinant and is a polynomial of degree n in λ,
called the characteristic polynomial of A. Thus

\[
c(\lambda) = \lambda^n + c_{n-1}\lambda^{n-1} + c_{n-2}\lambda^{n-2} + \cdots + c_1\lambda + c_0
\]

and the equation c(λ) = 0 is called the characteristic equation of A. We note that this
equation can be obtained just as well by evaluating |A − λI| = 0; however, the form
(1.6) is preferred for the definition of the characteristic equation, since the coefficient
of $\lambda^n$ is then always +1.

In many areas of engineering, particularly in those involving vibration or the control
of processes, the determination of those values of λ for which (1.5) has a non-trivial
solution (that is, a solution for which x ≠ 0) is of vital importance. These values of
λ are precisely the values that satisfy the characteristic equation, and are called the
eigenvalues of A.


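Example 1.2

Find the characteristic equation for the matrix

\[
A = \begin{bmatrix} 1 & 1 & -2 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{bmatrix}
\]


Solution

By (1.6), the characteristic equation for A is the cubic equation

\[
c(\lambda) = \begin{vmatrix} \lambda - 1 & -1 & 2 \\ 1 & \lambda - 2 & -1 \\ 0 & -1 & \lambda + 1 \end{vmatrix} = 0
\]

Expanding the determinant along the first column gives

\[
\begin{aligned}
c(\lambda) &= (\lambda - 1)\begin{vmatrix} \lambda - 2 & -1 \\ -1 & \lambda + 1 \end{vmatrix}
 - \begin{vmatrix} -1 & 2 \\ -1 & \lambda + 1 \end{vmatrix} \\
&= (\lambda - 1)[(\lambda - 2)(\lambda + 1) - 1] - [2 - (\lambda + 1)]
\end{aligned}
\]

Thus

\[
c(\lambda) = \lambda^3 - 2\lambda^2 - \lambda + 2 = 0
\]

is the required characteristic equation.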


For matrices of large order, determining the characteristic polynomial by direct
expansion of |λI − A| is unsatisfactory in view of the large number of terms involved
in the determinant expansion. Alternative procedures are available to reduce the amount
of calculation, and that due to Faddeev may be stated as follows.

The method of Faddeev

If the characteristic polynomial of an n × n matrix A is written as

\[
\lambda^n - p_1\lambda^{n-1} - \cdots - p_{n-1}\lambda - p_n
\]

then the coefficients $p_1, p_2, \ldots, p_n$ can be computed using

\[
p_r = \frac{1}{r}\operatorname{trace} A_r \quad (r = 1, 2, \ldots, n)
\]

where

\[
A_r = \begin{cases} A & (r = 1) \\ AB_{r-1} & (r = 2, 3, \ldots, n) \end{cases}
\]

and

\[
B_r = A_r - p_r I, \quad \text{where } I \text{ is the } n \times n \text{ identity matrix}
\]

The calculations may be checked using the result that

\[
B_n = A_n - p_n I \quad \text{must be the zero matrix}
\]


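Example 1.3

Using the method of Faddeev, obtain the characteristic equation of the matrix A of
Example 1.2.


Solution

\[
A = \begin{bmatrix} 1 & 1 & -2 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{bmatrix}
\]

Let the characteristic equation be

\[
c(\lambda) = \lambda^3 - p_1\lambda^2 - p_2\lambda - p_3
\]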


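Then, following the procedure described above,

\[
p_1 = \operatorname{trace} A = (1 + 2 - 1) = 2
\]

\[
B_1 = A - 2I = \begin{bmatrix} -1 & 1 & -2 \\ -1 & 0 & 1 \\ 0 & 1 & -3 \end{bmatrix}
\]

\[
A_2 = AB_1 = \begin{bmatrix} -2 & -1 & 5 \\ -1 & 0 & 1 \\ -1 & -1 & 4 \end{bmatrix}
\]

\[
p_2 = \tfrac{1}{2}\operatorname{trace} A_2 = \tfrac{1}{2}(-2 + 0 + 4) = 1
\]

\[
B_2 = A_2 - I = \begin{bmatrix} -3 & -1 & 5 \\ -1 & -1 & 1 \\ -1 & -1 & 3 \end{bmatrix}
\]

\[
A_3 = AB_2 = \begin{bmatrix} -2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & -2 \end{bmatrix}
\]

\[
p_3 = \tfrac{1}{3}\operatorname{trace} A_3 = \tfrac{1}{3}(-2 - 2 - 2) = -2
\]

Then, the characteristic polynomial of A is

\[
c(\lambda) = \lambda^3 - 2\lambda^2 - \lambda + 2
\]

in agreement with the result of Example 1.2. In this case, however, a check may be
carried out on the computation, since

\[
B_3 = A_3 + 2I = 0
\]

as required.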
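
The recurrence in the method of Faddeev is straightforward to automate. The following Python sketch follows the statement of the method given above, using exact fractions; the function name `faddeev` and its return convention (the list of coefficients together with the final matrix Bₙ) are assumptions made for illustration.

```python
from fractions import Fraction


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def faddeev(A):
    """Return ([p_1, ..., p_n], B_n) for the characteristic polynomial
    lambda^n - p_1*lambda^(n-1) - ... - p_n of the square matrix A."""
    n = len(A)
    coeffs = []
    A_r = A  # A_1 = A
    for r in range(1, n + 1):
        # p_r = (1/r) trace A_r
        p_r = Fraction(sum(A_r[i][i] for i in range(n)), r)
        coeffs.append(p_r)
        # B_r = A_r - p_r I
        B_r = [[A_r[i][j] - (p_r if i == j else 0) for j in range(n)]
               for i in range(n)]
        if r < n:
            A_r = matmul(A, B_r)  # A_{r+1} = A B_r
    return coeffs, B_r


# Matrix of Example 1.2.
A = [[1, 1, -2], [-1, 2, 1], [0, 1, -1]]
p, B_n = faddeev(A)

print([int(v) for v in p])                           # p_1, p_2, p_3
print(all(entry == 0 for row in B_n for entry in row))   # check: B_n is the zero matrix
```

For the matrix of Example 1.2 the coefficients are 2, 1 and −2, reproducing c(λ) = λ³ − 2λ² − λ + 2, and the final matrix is zero as the check requires.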


1.4.2 Eigenvalues and eigenvectors


The roots of the characteristic equation (1.6) are called the eigenvalues of the matrix A
(the terms latent roots, proper roots and characteristic roots are also sometimes used).
By the Fundamental Theorem of Algebra, a polynomial equation of degree n has
exactly n roots, so that the matrix A has exactly n eigenvalues $\lambda_i$, i = 1, 2, ..., n. These
eigenvalues may be real or complex, and not necessarily distinct. Corresponding to each
eigenvalue $\lambda_i$ there is a non-zero solution $x = e_i$ of (1.5); $e_i$ is called the eigenvector of
A corresponding to the eigenvalue $\lambda_i$. (Again the terms latent vector, proper vector and
characteristic vector are sometimes seen, but are generally obsolete.) We note that if
$x = e_i$ satisfies (1.5) then any scalar multiple $\beta_i e_i$ of $e_i$ also satisfies (1.5), so that the
eigenvector $e_i$ may only be determined to within a scalar multiple.

Example 1.4

Determine the eigenvalues and eigenvectors for the matrix A of Example 1.2.


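Solution

\[ A = \begin{bmatrix} 1 & 1 & -2 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{bmatrix} \]

The eigenvalues $\lambda_i$ of A satisfy the characteristic equation $c(\lambda) = 0$, and this has been obtained in Examples 1.2 and 1.3 as the cubic

\[ \lambda^3 - 2\lambda^2 - \lambda + 2 = 0 \]

which can be solved to obtain the eigenvalues $\lambda_1$, $\lambda_2$ and $\lambda_3$.

Alternatively, it may be possible, using the determinant form $|\lambda I - A|$, or indeed (as we often do when seeking the eigenvalues) the form $|A - \lambda I|$, by carrying out suitable row and/or column operations to factorize the determinant.

In this case

\[ |A - \lambda I| = \begin{vmatrix} 1-\lambda & 1 & -2 \\ -1 & 2-\lambda & 1 \\ 0 & 1 & -1-\lambda \end{vmatrix} \]

and adding column 1 to column 3 gives

\[ \begin{vmatrix} 1-\lambda & 1 & -1-\lambda \\ -1 & 2-\lambda & 0 \\ 0 & 1 & -1-\lambda \end{vmatrix} = -(1+\lambda) \begin{vmatrix} 1-\lambda & 1 & 1 \\ -1 & 2-\lambda & 0 \\ 0 & 1 & 1 \end{vmatrix} \]

Subtracting row 3 from row 1 gives

\[ -(1+\lambda) \begin{vmatrix} 1-\lambda & 0 & 0 \\ -1 & 2-\lambda & 0 \\ 0 & 1 & 1 \end{vmatrix} = -(1+\lambda)(1-\lambda)(2-\lambda) \]

Setting $|A - \lambda I| = 0$ gives the eigenvalues as $\lambda_1 = 2$, $\lambda_2 = 1$ and $\lambda_3 = -1$. The order in which they are written is arbitrary, but for consistency we shall adopt the convention of taking $\lambda_1, \lambda_2, \ldots, \lambda_n$ in decreasing order.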

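Having obtained the eigenvalues $\lambda_i$ ($i = 1, 2, 3$), the corresponding eigenvectors $e_i$ are obtained by solving the appropriate homogeneous equations

\[ (A - \lambda_i I)e_i = 0 \qquad (1.7) \]

When $i = 1$, $\lambda_i = \lambda_1 = 2$ and (1.7) is

\[ \begin{bmatrix} -1 & 1 & -2 \\ -1 & 0 & 1 \\ 0 & 1 & -3 \end{bmatrix} \begin{bmatrix} e_{11} \\ e_{12} \\ e_{13} \end{bmatrix} = 0 \]

that is,

\[ -e_{11} + e_{12} - 2e_{13} = 0 \]
\[ -e_{11} + 0e_{12} + e_{13} = 0 \]
\[ 0e_{11} + e_{12} - 3e_{13} = 0 \]

leading to the solution

\[ \frac{e_{11}}{-1} = \frac{-e_{12}}{3} = \frac{e_{13}}{-1} = \beta_1 \]

where $\beta_1$ is an arbitrary non-zero scalar. Thus the eigenvector $e_1$ corresponding to the eigenvalue $\lambda_1 = 2$ is

\[ e_1 = \beta_1 [1 \;\; 3 \;\; 1]^T \]

As a check, we can compute

\[ Ae_1 = \beta_1 \begin{bmatrix} 1 & 1 & -2 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} = \beta_1 \begin{bmatrix} 2 \\ 6 \\ 2 \end{bmatrix} = 2\beta_1 \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} = \lambda_1 e_1 \]

and thus conclude that our calculation was correct.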
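When $i = 2$, $\lambda_i = \lambda_2 = 1$ and we have to solve

\[ \begin{bmatrix} 0 & 1 & -2 \\ -1 & 1 & 1 \\ 0 & 1 & -2 \end{bmatrix} \begin{bmatrix} e_{21} \\ e_{22} \\ e_{23} \end{bmatrix} = 0 \]

that is,

\[ 0e_{21} + e_{22} - 2e_{23} = 0 \]
\[ -e_{21} + e_{22} + e_{23} = 0 \]
\[ 0e_{21} + e_{22} - 2e_{23} = 0 \]

leading to the solution

\[ e_{21} = 3\beta_2, \quad e_{22} = 2\beta_2, \quad e_{23} = \beta_2 \]

where $\beta_2$ is an arbitrary non-zero scalar. Thus the eigenvector $e_2$ corresponding to the eigenvalue $\lambda_2 = 1$ is

\[ e_2 = \beta_2 [3 \;\; 2 \;\; 1]^T \]

Again a check could be made by computing $Ae_2$.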
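Finally, when $i = 3$, $\lambda_i = \lambda_3 = -1$ and we obtain from (1.7)

\[ \begin{bmatrix} 2 & 1 & -2 \\ -1 & 3 & 1 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} e_{31} \\ e_{32} \\ e_{33} \end{bmatrix} = 0 \]

that is,

\[ 2e_{31} + e_{32} - 2e_{33} = 0 \]
\[ -e_{31} + 3e_{32} + e_{33} = 0 \]
\[ 0e_{31} + e_{32} + 0e_{33} = 0 \]

and hence

\[ e_{31} = \beta_3, \quad e_{32} = 0, \quad e_{33} = \beta_3 \]

Here again $\beta_3$ is an arbitrary non-zero scalar, and the eigenvector $e_3$ corresponding to the eigenvalue $\lambda_3$ is

\[ e_3 = \beta_3 [1 \;\; 0 \;\; 1]^T \]

The calculation can be checked as before. Thus we have found that the eigenvalues of the matrix A are 2, 1 and -1, with corresponding eigenvectors

\[ \beta_1 [1 \;\; 3 \;\; 1]^T, \quad \beta_2 [3 \;\; 2 \;\; 1]^T \quad \text{and} \quad \beta_3 [1 \;\; 0 \;\; 1]^T \]

respectively.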
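The checks suggested in the solution can be automated. The following Python sketch (the helper names `matvec` and `is_eigenpair` are illustrative assumptions, not notation from the text) confirms that each pair found above satisfies $Ae = \lambda e$, and that an arbitrary scalar multiple of an eigenvector passes the same test.

```python
def matvec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def is_eigenpair(A, lam, e):
    """True when A e equals lam e exactly (integer arithmetic)."""
    return matvec(A, e) == [lam * v for v in e]


A = [[1, 1, -2], [-1, 2, 1], [0, 1, -1]]
pairs = [(2, [1, 3, 1]), (1, [3, 2, 1]), (-1, [1, 0, 1])]

checks = [is_eigenpair(A, lam, e) for lam, e in pairs]
print(checks)

# Any scalar multiple (here beta = 5) is also an eigenvector
scaled_ok = is_eigenpair(A, 2, [5 * v for v in [1, 3, 1]])
print(scaled_ok)
```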


Since in Example 1.4 the $\beta_i$, $i = 1, 2, 3$, are arbitrary, it follows that there are an infinite number of eigenvectors, scalar multiples of each other, corresponding to each eigenvalue. Sometimes it is convenient to scale the eigenvectors according to some convention. A convention frequently adopted is to normalize the eigenvectors so that they are uniquely determined up to a scale factor of $\pm 1$. The normalized form of an eigenvector $e = [e_1 \;\; e_2 \;\; \ldots \;\; e_n]^T$ is denoted by $\hat{e}$ and is given by

\[ \hat{e} = \frac{e}{|e|} \]

where

\[ |e| = \sqrt{(e_1^2 + e_2^2 + \ldots + e_n^2)} \]

For example, for the matrix A of Example 1.4, the normalized forms of the eigenvectors are

\[ \hat{e}_1 = [1/\sqrt{11} \;\; 3/\sqrt{11} \;\; 1/\sqrt{11}]^T, \quad \hat{e}_2 = [3/\sqrt{14} \;\; 2/\sqrt{14} \;\; 1/\sqrt{14}]^T \]

and

\[ \hat{e}_3 = [1/\sqrt{2} \;\; 0 \;\; 1/\sqrt{2}]^T \]

However, throughout the text, unless otherwise stated, the eigenvectors will always be presented in their 'simplest' form, so that for the matrix of Example 1.4 we take $\beta_1 = \beta_2 = \beta_3 = 1$ and write

\[ e_1 = [1 \;\; 3 \;\; 1]^T, \quad e_2 = [3 \;\; 2 \;\; 1]^T \quad \text{and} \quad e_3 = [1 \;\; 0 \;\; 1]^T \]

For an n x n matrix A the MATLAB command p=poly(A) generates an (n + 1)-element row vector whose elements are the coefficients of the characteristic polynomial of A, the coefficients being ordered in descending powers. The eigenvalues of A are the roots of the polynomial and are generated using the command roots(p). The command

    [M,S]=eig(A)

generates the normalized eigenvectors of A as the columns of the matrix M and its corresponding eigenvalues as the diagonal elements of the diagonal matrix S (M and S are called respectively the modal and spectral matrices of A and we shall return to discuss them in more detail in Section 1.6.1). In the absence of the left-hand arguments, the command eig(A) by itself simply generates the eigenvalues of A.


For the matrix A of Example 1.4 the commands

    A=[1 1 -2; -1 2 1; 0 1 -1];
    [M,S]=eig(A)

generate the output

    M =
        0.3015   -0.8018    0.7071
        0.9045   -0.5345    0.0000
        0.3015   -0.2673    0.7071

    S =
        2.0000         0         0
             0    1.0000         0
             0         0   -1.0000

These concur with our calculated answers, with $\beta_1 = 0.3015$, $\beta_2 = -0.2673$ and $\beta_3 = 0.7071$.

Using the Symbolic Math Toolbox in MATLAB we saw earlier that the matrix A may be converted from numeric into symbolic form using the command A=sym(A). Then its symbolic eigenvalues and eigenvectors are generated using the sequence of commands

    A=[1 1 -2; -1 2 1; 0 1 -1];
    A=sym(A);
    [M,S]=eig(A)

as

    M=[3, 1, 1]
      [2, 3, 0]
      [1, 1, 1]

    S=[1, 0, 0]
      [0, 2, 0]
      [0, 0, -1]





In MAPLE the command Eigenvalues(A); returns a vector of eigenvalues. The command Eigenvectors(A) returns both a vector of eigenvalues as before and a matrix containing the eigenvectors, so that the ith column is an eigenvector corresponding to the eigenvalue in the ith entry of the preceding vector. Thus the commands:

    with(LinearAlgebra):
    A:=Matrix([[1,1,-2],[-1,2,1],[0,1,-1]]);
    Eigenvalues(A);

return the eigenvalues 2, 1 and -1 as a column vector, and the command

    Eigenvectors(A);

returns the same vector of eigenvalues together with a matrix whose columns are the corresponding eigenvectors.


Example 1.5  Find the eigenvalues and eigenvectors of

\[ A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \]

Solution  Now

\[ |\lambda I - A| = \begin{vmatrix} \lambda - \cos\theta & \sin\theta \\ -\sin\theta & \lambda - \cos\theta \end{vmatrix} = \lambda^2 - 2\lambda\cos\theta + \cos^2\theta + \sin^2\theta = \lambda^2 - 2\lambda\cos\theta + 1 \]

So the eigenvalues are the roots of

\[ \lambda^2 - 2\lambda\cos\theta + 1 = 0 \]

that is,

\[ \lambda = \cos\theta \pm j\sin\theta \]

Solving for the eigenvectors as in Example 1.4, we obtain

\[ e_1 = [1 \;\; -j]^T \quad \text{and} \quad e_2 = [1 \;\; j]^T \]

In Example 1.5 we see that eigenvalues can be complex numbers, and that the eigenvectors may have complex components. This situation arises when the characteristic equation has complex (conjugate) roots.
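The result of Example 1.5 can be checked numerically for a particular angle. The sketch below is an illustration only: the angle $\theta = 0.7$ is an arbitrary choice with $\sin\theta > 0$, and the helper names `rotation_eigenvalues` and `residual` are hypothetical. It solves the quadratic with complex arithmetic and confirms that $[1 \;\; -j]^T$ pairs with $\cos\theta + j\sin\theta$ and $[1 \;\; j]^T$ with $\cos\theta - j\sin\theta$.

```python
import cmath
import math


def rotation_eigenvalues(theta):
    """Roots of lambda^2 - 2 lambda cos(theta) + 1 = 0 by the quadratic formula."""
    c = math.cos(theta)
    disc = cmath.sqrt((2 * c) ** 2 - 4)   # purely imaginary when sin(theta) != 0
    return (2 * c + disc) / 2, (2 * c - disc) / 2


def residual(A, lam, e):
    """Largest component of A e - lam e in modulus."""
    return max(abs(sum(a * v for a, v in zip(row, e)) - lam * ei)
               for row, ei in zip(A, e))


theta = 0.7   # arbitrary test angle (assumption), with sin(theta) > 0
A = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]

lam1, lam2 = rotation_eigenvalues(theta)
r1 = residual(A, lam1, [1, -1j])   # eigenvector [1, -j]^T
r2 = residual(A, lam2, [1, 1j])    # eigenvector [1,  j]^T
print(lam1, lam2, r1, r2)
```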




1.4.3 Exercises 


Check your answers using MATLAB or MAPLE whenever possible. 


characteristic polynomials of the matrices 


3 2 1 
(а) |4 5 -l 
23 4 


a) { | 
jd 


1.4.4 











Using the method of Faddeev, obtain the 1 0 4l 1 12 
0105 4 (01,022 
2-1 12 |l-4 | -1 13 
0 1] 1 0 
(b) - + - 
=] 1 1 1 5 0 6 1 -1 0 
1 110 (0|0 11 6 (1 2 
. с ; 6 6 -2 -2 -1 
Find the eigenvalues and corresponding = = = 
eigenvectors of the matrices г 4 = 
1 1 1 -4 -2 
1 2 2 54 (0 3 1 
(b) 
3 2 |[-1 -1 0) 1 2 4 
1.4.4 Repeated eigenvalues

In the examples considered so far the eigenvalues $\lambda_i$ ($i = 1, 2, \ldots$) of the matrix A have been distinct, and in such cases the corresponding eigenvectors can be found and are linearly independent. The matrix A is then said to have a full set of linearly independent eigenvectors. It is clear that the roots of the characteristic polynomial $c(\lambda)$ may not all be distinct; and when $c(\lambda)$ has $p < n$ distinct roots, $c(\lambda)$ may be factorized as

\[ c(\lambda) = (\lambda - \lambda_1)^{m_1}(\lambda - \lambda_2)^{m_2} \ldots (\lambda - \lambda_p)^{m_p} \]

indicating that the root $\lambda = \lambda_i$, $i = 1, 2, \ldots, p$, is a root of order $m_i$, where the integer $m_i$ is called the algebraic multiplicity of the eigenvalue $\lambda_i$. Clearly $m_1 + m_2 + \ldots + m_p = n$. When a matrix A has repeated eigenvalues, the question arises as to whether it is possible to obtain a full set of linearly independent eigenvectors for A. We first consider two examples to illustrate the situation.


Example 1.6  Determine the eigenvalues and corresponding eigenvectors of the matrix

\[ A = \begin{bmatrix} 3 & -3 & 2 \\ -1 & 5 & -2 \\ -1 & 3 & 0 \end{bmatrix} \]

Solution  We find the eigenvalues from

\[ \begin{vmatrix} 3-\lambda & -3 & 2 \\ -1 & 5-\lambda & -2 \\ -1 & 3 & -\lambda \end{vmatrix} = 0 \]

as $\lambda_1 = 4$, $\lambda_2 = \lambda_3 = 2$.


The eigenvectors are obtained from

\[ (A - \lambda I)e_i = 0 \qquad (1.8) \]

and when $\lambda = \lambda_1 = 4$, we obtain from (1.8)

\[ e_1 = [1 \;\; -1 \;\; -1]^T \]

When $\lambda = \lambda_2 = \lambda_3 = 2$, (1.8) becomes

\[ \begin{bmatrix} 1 & -3 & 2 \\ -1 & 3 & -2 \\ -1 & 3 & -2 \end{bmatrix} \begin{bmatrix} e_{21} \\ e_{22} \\ e_{23} \end{bmatrix} = 0 \]

so that the corresponding eigenvector is obtained from the single equation

\[ e_{21} - 3e_{22} + 2e_{23} = 0 \qquad (1.9) \]

Clearly we are free to choose any two of the components $e_{21}$, $e_{22}$ or $e_{23}$ at will, with the remaining one determined by (1.9). Suppose we set $e_{22} = \alpha$ and $e_{23} = \beta$; then (1.9) means that $e_{21} = 3\alpha - 2\beta$, and thus

\[ e_2 = [3\alpha - 2\beta \;\; \alpha \;\; \beta]^T = \alpha \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix} + \beta \begin{bmatrix} -2 \\ 0 \\ 1 \end{bmatrix} \qquad (1.10) \]

Now $\lambda = 2$ is an eigenvalue of multiplicity 2, and we seek, if possible, two linearly independent eigenvectors defined by (1.10). Setting $\alpha = 1$ and $\beta = 0$ yields

\[ e_2 = [3 \;\; 1 \;\; 0]^T \]

and setting $\alpha = 0$ and $\beta = 1$ gives a second vector

\[ e_3 = [-2 \;\; 0 \;\; 1]^T \]

These two vectors are linearly independent and of the form defined by (1.10), and it is clear that many other choices are possible. However, any other choices of the form (1.10) will be linear combinations of $e_2$ and $e_3$ as chosen above. For example, $e = [1 \;\; 1 \;\; 1]^T$ satisfies (1.10), but $e = e_2 + e_3$.

In this example, although there was a repeated eigenvalue of algebraic multiplicity 2, it was possible to construct two linearly independent eigenvectors corresponding to this eigenvalue. Thus the matrix A has three and only three linearly independent eigenvectors.


The MATLAB commands

    A=[3 -3 2; -1 5 -2; -1 3 0];
    [M,S]=eig(A)

generate
@ = Л ЛАК Бу MN CO MITES] S 
==) Б Лу ШШ ЕЙ ОЕ 5 
xoc a ОБА (PIGES OIN 
    S =
        4.0000         0         0
             0    2.0000         0
             0         0    2.0000

Clearly the first column of M (corresponding to the eigenvalue $\lambda_1 = 4$) is a scalar multiple of $e_1$. The second and third columns of M (corresponding to the repeated eigenvalue $\lambda_2 = \lambda_3 = 2$) are not scalar multiples of $e_2$ and $e_3$. However, both satisfy (1.10) and are equally acceptable as a pair of linearly independent eigenvectors corresponding to the repeated eigenvalue. It is left as an exercise to show that both are linear combinations of $e_2$ and $e_3$.

Check that in symbolic form the commands

    A=sym(A);
    [M,S]=eig(A)

generate

    M=[-1, 3, -2]
      [ 1, 1,  0]
      [ 1, 0,  1]

    S=[4, 0, 0]
      [0, 2, 0]
      [0, 0, 2]

In MAPLE the command Eigenvectors(A); produces corresponding results. Thus the commands

    with(LinearAlgebra):
    A:=Matrix([[3,-3,2],[-1,5,-2],[-1,3,0]]);
    Eigenvectors(A);

return the eigenvalues 4, 2 and 2 together with a matrix whose columns are corresponding eigenvectors.





Example 1.7  Determine the eigenvalues and corresponding eigenvectors for the matrix

\[ A = \begin{bmatrix} 1 & 2 & 2 \\ 0 & 2 & 1 \\ -1 & 2 & 2 \end{bmatrix} \]

Solution  Solving $|A - \lambda I| = 0$ gives the eigenvalues as $\lambda_1 = \lambda_2 = 2$, $\lambda_3 = 1$. The eigenvector corresponding to the non-repeated or simple eigenvalue $\lambda_3 = 1$ is easily found as

\[ e_3 = [1 \;\; 1 \;\; -1]^T \]

When $\lambda = \lambda_1 = \lambda_2 = 2$, the corresponding eigenvector is given by

\[ (A - 2I)e_1 = 0 \]

that is, as the solution of

\[ -e_{11} + 2e_{12} + 2e_{13} = 0 \qquad \text{(i)} \]
\[ e_{13} = 0 \qquad \text{(ii)} \]
\[ -e_{11} + 2e_{12} = 0 \qquad \text{(iii)} \]

From (ii) we have $e_{13} = 0$, and from (i) and (iii) it follows that $e_{11} = 2e_{12}$. We deduce that there is only one linearly independent eigenvector corresponding to the repeated eigenvalue $\lambda = 2$, namely

\[ e_1 = [2 \;\; 1 \;\; 0]^T \]

and in this case the matrix A does not possess a full set of linearly independent eigenvectors.


We see from Examples 1.6 and 1.7 that if an n x n matrix A has repeated eigenvalues then a full set of n linearly independent eigenvectors may or may not exist. The number of linearly independent eigenvectors associated with a repeated eigenvalue $\lambda_i$ of algebraic multiplicity $m_i$ is given by the nullity $q_i$ of the matrix $A - \lambda_i I$, where

\[ q_i = n - \mathrm{rank}(A - \lambda_i I), \quad \text{with } 1 \le q_i \le m_i \qquad (1.11) \]

$q_i$ is sometimes referred to as the degeneracy of the matrix $A - \lambda_i I$ or the geometric multiplicity of the eigenvalue $\lambda_i$, since it determines the dimension of the space spanned by the corresponding eigenvector(s) $e_i$.


Example 1.8  Confirm the findings of Examples 1.6 and 1.7 concerning the number of linearly independent eigenvectors found.


Solution  In Example 1.6, we had an eigenvalue $\lambda_2 = 2$ of algebraic multiplicity 2. Correspondingly,

\[ A - \lambda_2 I = \begin{bmatrix} 3-2 & -3 & 2 \\ -1 & 5-2 & -2 \\ -1 & 3 & -2 \end{bmatrix} = \begin{bmatrix} 1 & -3 & 2 \\ -1 & 3 & -2 \\ -1 & 3 & -2 \end{bmatrix} \]

and performing the row operation of adding row 1 to rows 2 and 3 yields

\[ \begin{bmatrix} 1 & -3 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

Adding 3 times column 1 to column 2 followed by subtracting 2 times column 1 from column 3 gives finally

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

indicating a rank of 1. Then from (1.11) the nullity $q_2 = 3 - 1 = 2$, confirming that corresponding to the eigenvalue $\lambda = 2$ there are two linearly independent eigenvectors, as found in Example 1.6.

In Example 1.7 we again had a repeated eigenvalue $\lambda_1 = 2$ of algebraic multiplicity 2. Then

\[ A - 2I = \begin{bmatrix} 1-2 & 2 & 2 \\ 0 & 2-2 & 1 \\ -1 & 2 & 2-2 \end{bmatrix} = \begin{bmatrix} -1 & 2 & 2 \\ 0 & 0 & 1 \\ -1 & 2 & 0 \end{bmatrix} \]

Performing row and column operations as before produces the matrix

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \]

this time indicating a rank of 2. From (1.11) the nullity $q_1 = 3 - 2 = 1$, confirming that there is one and only one linearly independent eigenvector associated with this eigenvalue, as found in Example 1.7.
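The rank calculation in (1.11) is mechanical and can be sketched in Python. The code below is an illustrative sketch rather than a library routine: `rank` performs Gaussian elimination with exact fractions (row operations only, which is sufficient for the rank), and `geometric_multiplicity` is a hypothetical helper name for the nullity $q_i$.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    n_rows, n_cols = len(rows), len(rows[0])
    found = 0                                   # number of pivots found so far
    for col in range(n_cols):
        pivot = next((r for r in range(found, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue                            # no pivot in this column
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, n_rows):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def geometric_multiplicity(A, lam):
    """Nullity q = n - rank(A - lam I), as in (1.11)."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)


A_16 = [[3, -3, 2], [-1, 5, -2], [-1, 3, 0]]   # matrix of Example 1.6
A_17 = [[1, 2, 2], [0, 2, 1], [-1, 2, 2]]      # matrix of Example 1.7

q_16 = geometric_multiplicity(A_16, 2)
q_17 = geometric_multiplicity(A_17, 2)
print(q_16, q_17)
```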


1.4.5 Exercises 


Check your answers using MATLAB or MAPLE whenever possible. 


Obtain the eigenvalues and corresponding 


using the concept of rank, determine how 


eigenvectors of the matrices many linearly independent eigenvectors 


» uw 
(a) |1 3 1 
y 32 

4 6 

() | 1 3 

el =5 


correspond to this value of A. Determine a 


eigenvalue of the matrix 


04 =2 её corresponding set of linearly independent 
(b) |-1 1 2 eigenvectors. 
-1 -1 2 
9 Given that 2 = 1 is a twice-repeated eigenvalue 
6 UE S$ of the matrix 
2 ( |3 0 -2 
-2 6 -2 -3 2 =1 
A=|-1 0 1 
Given that A = 1 is a three-times repeated 1 2 2 
-7 25 how many linearly independent eigenvectors 


-3 


1.4.6 


correspond to this value of A? Determine a 
corresponding set of linearly independent 
2 2 eigenvectors. 


1.4.6 Some useful properties of eigenvalues


The following basic properties of the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ of an n x n matrix A are sometimes useful. The results are readily proved either from the definition of eigenvalues as the values of $\lambda$ satisfying (1.4), or by comparison of corresponding characteristic polynomials (1.6). Consequently, the proofs are left to Exercise 10.

Property 1.1


The sum of the eigenvalues of A is

\[ \sum_{i=1}^{n} \lambda_i = \mathrm{trace}\,A = \sum_{i=1}^{n} a_{ii} \]


Property 1.2

The product of the eigenvalues of A is

\[ \prod_{i=1}^{n} \lambda_i = \det A \]

where $\det A$ denotes the determinant of the matrix A.


Property 1.3

The eigenvalues of the inverse matrix $A^{-1}$, provided it exists, are

\[ \frac{1}{\lambda_1}, \frac{1}{\lambda_2}, \ldots, \frac{1}{\lambda_n} \]


Property 1.4

The eigenvalues of the transposed matrix $A^T$ are

\[ \lambda_1, \lambda_2, \ldots, \lambda_n \]

as for the matrix A.


Property 1.5

If k is a scalar then the eigenvalues of kA are

\[ k\lambda_1, k\lambda_2, \ldots, k\lambda_n \]


Property 1.6

If k is a scalar and I the n x n identity (unit) matrix then the eigenvalues of $A \pm kI$ are respectively

\[ \lambda_1 \pm k, \lambda_2 \pm k, \ldots, \lambda_n \pm k \]


Property 1.7

If k is a positive integer then the eigenvalues of $A^k$ are

\[ \lambda_1^k, \lambda_2^k, \ldots, \lambda_n^k \]
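Several of these properties can be confirmed on the matrix of Example 1.4, whose eigenvalues are 2, 1 and -1. The Python sketch below is illustrative only: the helper names are assumptions, the value $k = 2$ is an arbitrary choice, and a number $\mu$ is accepted as an eigenvalue of a matrix B when $\det(B - \mu I) = 0$ in exact integer arithmetic.

```python
def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_eigenvalue(B, mu):
    """True when det(B - mu I) = 0."""
    n = len(B)
    return det([[B[i][j] - (mu if i == j else 0) for j in range(n)]
                for i in range(n)]) == 0


A = [[1, 1, -2], [-1, 2, 1], [0, 1, -1]]
eigs = [2, 1, -1]
k = 2
n = len(A)

trace_ok = sum(A[i][i] for i in range(n)) == sum(eigs)                    # Property 1.1
det_ok = det(A) == eigs[0] * eigs[1] * eigs[2]                            # Property 1.2
transpose = [list(col) for col in zip(*A)]
transpose_ok = all(is_eigenvalue(transpose, lam) for lam in eigs)         # Property 1.4
kA = [[k * v for v in row] for row in A]
scaled_ok = all(is_eigenvalue(kA, k * lam) for lam in eigs)               # Property 1.5
A_plus_kI = [[A[i][j] + (k if i == j else 0) for j in range(n)] for i in range(n)]
shifted_ok = all(is_eigenvalue(A_plus_kI, lam + k) for lam in eigs)       # Property 1.6
power_ok = all(is_eigenvalue(matmul(A, A), lam ** 2) for lam in eigs)     # Property 1.7

print(trace_ok, det_ok, transpose_ok, scaled_ok, shifted_ok, power_ok)
```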


1.4.7 Symmetric matrices


A square matrix A is said to be symmetric if $A^T = A$. Such matrices form an important class and arise in a variety of practical situations. Two important results concerning the eigenvalues and eigenvectors of such matrices are

(a) the eigenvalues of a real symmetric matrix are real;

(b) for an n x n real symmetric matrix it is always possible to find n linearly independent eigenvectors $e_1, e_2, \ldots, e_n$ that are mutually orthogonal so that $e_i^T e_j = 0$ for $i \ne j$.

If the orthogonal eigenvectors of a symmetric matrix are normalized as

\[ \hat{e}_1, \hat{e}_2, \ldots, \hat{e}_n \]

then the inner (scalar) product is

\[ \hat{e}_i^T \hat{e}_j = \delta_{ij} \quad (i, j = 1, 2, \ldots, n) \]

where $\delta_{ij}$ is the Kronecker delta defined in Section 1.3.2.

The set of normalized eigenvectors of a symmetric matrix therefore forms an orthonormal set (that is, it forms a mutually orthogonal normalized set of vectors).


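Example 1.9  Obtain the eigenvalues and corresponding orthogonal eigenvectors of the symmetric matrix

\[ A = \begin{bmatrix} 2 & 2 & 0 \\ 2 & 5 & 0 \\ 0 & 0 & 3 \end{bmatrix} \]

and show that the normalized eigenvectors form an orthonormal set.


Solution  The eigenvalues of A are $\lambda_1 = 6$, $\lambda_2 = 3$ and $\lambda_3 = 1$, with corresponding eigenvectors

\[ e_1 = [1 \;\; 2 \;\; 0]^T, \quad e_2 = [0 \;\; 0 \;\; 1]^T, \quad e_3 = [-2 \;\; 1 \;\; 0]^T \]

which in normalized form are

\[ \hat{e}_1 = [1 \;\; 2 \;\; 0]^T/\sqrt{5}, \quad \hat{e}_2 = [0 \;\; 0 \;\; 1]^T, \quad \hat{e}_3 = [-2 \;\; 1 \;\; 0]^T/\sqrt{5} \]

Evaluating the inner products, we see that, for example,

\[ \hat{e}_1^T \hat{e}_1 = \tfrac{1}{5} + \tfrac{4}{5} + 0 = 1, \quad \hat{e}_1^T \hat{e}_3 = -\tfrac{2}{5} + \tfrac{2}{5} + 0 = 0 \]

and that

\[ \hat{e}_i^T \hat{e}_j = \delta_{ij} \quad (i, j = 1, 2, 3) \]

confirming that the eigenvectors form an orthonormal set.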
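The full set of inner products in Example 1.9 can be tabulated at once. The short Python sketch below (helper names `normalize` and `dot` are illustrative assumptions) normalizes the three eigenvectors and forms the matrix of inner products $\hat{e}_i^T \hat{e}_j$, which should equal the identity matrix to within rounding error.

```python
import math


def normalize(e):
    """Scale a vector to unit length: e / |e|."""
    norm = math.sqrt(sum(v * v for v in e))
    return [v / norm for v in e]


def dot(u, v):
    """Inner (scalar) product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


eigenvectors = [[1, 2, 0], [0, 0, 1], [-2, 1, 0]]   # e_1, e_2, e_3 of Example 1.9
unit = [normalize(e) for e in eigenvectors]

# gram[i][j] holds the inner product of the ith and jth normalized eigenvectors
gram = [[dot(u, v) for v in unit] for u in unit]
for row in gram:
    print([round(v, 12) for v in row])
```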


1.4.8 Exercises 


Check your answers using MATLAB or MAPLE whenever possible. 


10  Verify Properties 1.1–1.7 of Section 1.4.6.

11  Given that the eigenvalues of the matrix

\[ A = \begin{bmatrix} 4 & 1 & 1 \\ 2 & 5 & 4 \\ -1 & -1 & 0 \end{bmatrix} \]

are 5, 3 and 1:
(a) confirm Properties 1.1–1.4 of Section 1.4.6;
(b) taking k = 2, confirm Properties 1.5–1.7 of Section 1.4.6.

12  Determine the eigenvalues and corresponding eigenvectors of the symmetric matrix

\[ A = \begin{bmatrix} -3 & -3 & -3 \\ -3 & 1 & -1 \\ -3 & -1 & 1 \end{bmatrix} \]

and verify that the eigenvectors are mutually orthogonal.

13  The 3 x 3 symmetric matrix A has eigenvalues 6, 3 and 2. The eigenvectors corresponding to the eigenvalues 6 and 3 are $[1 \;\; 1 \;\; 2]^T$ and $[1 \;\; 1 \;\; -1]^T$ respectively. Find an eigenvector corresponding to the eigenvalue 2.


1.5 Numerical methods


In practice we may well be dealing with matrices whose elements are decimal numbers or with matrices of high orders. In order to determine the eigenvalues and eigenvectors of such matrices, it is necessary that we have numerical algorithms at our disposal.


1.5.1 The power method


Consider a matrix A having n distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ and corresponding n linearly independent eigenvectors $e_1, e_2, \ldots, e_n$. Taking this set of vectors as the basis, we can write any vector $x = [x_1 \;\; x_2 \;\; \ldots \;\; x_n]^T$ as a linear combination in the form

\[ x = \alpha_1 e_1 + \alpha_2 e_2 + \ldots + \alpha_n e_n = \sum_{i=1}^{n} \alpha_i e_i \]

Then, since $Ae_i = \lambda_i e_i$ for $i = 1, 2, \ldots, n$,

\[ Ax = A\sum_{i=1}^{n} \alpha_i e_i = \sum_{i=1}^{n} \alpha_i \lambda_i e_i \]

and, for any positive integer k,

\[ A^k x = \sum_{i=1}^{n} \alpha_i \lambda_i^k e_i \]

or

\[ A^k x = \lambda_1^k \left[ \alpha_1 e_1 + \sum_{i=2}^{n} \alpha_i \left( \frac{\lambda_i}{\lambda_1} \right)^k e_i \right] \qquad (1.12) \]

Assuming that the eigenvalues are ordered such that

\[ |\lambda_1| > |\lambda_2| > \ldots > |\lambda_n| \]

and that $\alpha_1 \ne 0$, we have from (1.12)

\[ \lim_{k \to \infty} A^k x = \lambda_1^k \alpha_1 e_1 \qquad (1.13) \]

since all the other terms inside the square brackets tend to zero. The eigenvalue $\lambda_1$ and its corresponding eigenvector $e_1$ are referred to as the dominant eigenvalue and eigenvector respectively. The other eigenvalues and eigenvectors are called subdominant.

Thus if we introduce the iterative process

\[ x^{(k+1)} = Ax^{(k)} \quad (k = 0, 1, 2, \ldots) \]

starting with some arbitrary vector $x^{(0)}$ not orthogonal to $e_1$, it follows from (1.13) that

\[ x^{(k)} = A^k x^{(0)} \]

will converge to the dominant eigenvector of A.

A clear disadvantage with this scheme is that if $|\lambda_1|$ is large then $A^k x^{(0)}$ will become very large, and computer overflow can occur. This can be avoided by scaling the vector $x^{(k)}$ after each iteration. The standard approach is to make the largest element of $x^{(k)}$ unity using the scaling factor $\max(x^{(k)})$, which represents the element of $x^{(k)}$ having the largest modulus.

Thus in practice we adopt the iterative process

\[ y^{(k+1)} = Ax^{(k)} \]

\[ x^{(k+1)} = \frac{y^{(k+1)}}{\max(y^{(k+1)})} \quad (k = 0, 1, 2, \ldots) \qquad (1.14) \]

and it is common to take $x^{(0)} = [1 \;\; 1 \;\; \ldots \;\; 1]^T$.

Corresponding to (1.12), we have

\[ x^{(k)} = R\lambda_1^k \left[ \alpha_1 e_1 + \sum_{i=2}^{n} \alpha_i \left( \frac{\lambda_i}{\lambda_1} \right)^k e_i \right] \]

where

\[ R = [\max(y^{(1)})\max(y^{(2)}) \ldots \max(y^{(k)})]^{-1} \]

Again we see that $x^{(k)}$ converges to a multiple of the dominant eigenvector $e_1$. Also, since $Ax^{(k)} \to \lambda_1 x^{(k)}$, we have $y^{(k+1)} \to \lambda_1 x^{(k)}$, and since the largest element of $x^{(k)}$ is unity, it follows that the scaling factors $\max(y^{(k+1)})$ converge to the dominant eigenvalue $\lambda_1$. The rate of convergence depends primarily on the ratios

\[ \left| \frac{\lambda_2}{\lambda_1} \right|, \left| \frac{\lambda_3}{\lambda_1} \right|, \ldots, \left| \frac{\lambda_n}{\lambda_1} \right| \]

The smaller these ratios, the faster the rate of convergence. The iterative process represents the simplest form of the power method, and a pseudocode for the basic algorithm is given in Figure 1.1.

Figure 1.1 Outline pseudocode program for power method to calculate the maximum eigenvalue.




















    {read in x^T = [x_1 x_2 ... x_n]}
    m <- 0
    repeat
        mold <- m
        {evaluate y = Ax}
        {find m = max(y_i)}
        {x^T = [y_1/m y_2/m ... y_n/m]}
    until abs(m - mold) < tolerance
    {write (results)}

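Example 1.10  Use the power method to find the dominant eigenvalue and the corresponding eigenvector of the matrix

\[ A = \begin{bmatrix} 1 & 1 & -2 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{bmatrix} \]

Solution  Taking $x^{(0)} = [1 \;\; 1 \;\; 1]^T$ in (1.14), we have

\[ y^{(1)} = Ax^{(0)} = \begin{bmatrix} 1 & 1 & -2 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 2 \\ 0 \end{bmatrix} = 2 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}; \quad \lambda_1^{(1)} = 2 \]

\[ x^{(1)} = \tfrac{1}{2} y^{(1)} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \]

\[ y^{(2)} = Ax^{(1)} = \begin{bmatrix} 1 & 1 & -2 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} = 2 \begin{bmatrix} 0.5 \\ 1 \\ 0.5 \end{bmatrix}; \quad \lambda_1^{(2)} = 2 \]

\[ x^{(2)} = \tfrac{1}{2} y^{(2)} = \begin{bmatrix} \tfrac{1}{2} \\ 1 \\ \tfrac{1}{2} \end{bmatrix} \]

\[ y^{(3)} = Ax^{(2)} = \begin{bmatrix} 1 & 1 & -2 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} \tfrac{1}{2} \\ 1 \\ \tfrac{1}{2} \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2} \\ 2 \\ \tfrac{1}{2} \end{bmatrix} = 2 \begin{bmatrix} 0.25 \\ 1 \\ 0.25 \end{bmatrix}; \quad \lambda_1^{(3)} = 2 \]

\[ x^{(3)} = \tfrac{1}{2} y^{(3)} = \begin{bmatrix} 0.25 \\ 1 \\ 0.25 \end{bmatrix} \]

Continuing with the process, we have

\[ y^{(4)} = 2[0.375 \;\; 1 \;\; 0.375]^T \]
\[ y^{(5)} = 2[0.312 \;\; 1 \;\; 0.312]^T \]
\[ y^{(6)} = 2[0.344 \;\; 1 \;\; 0.344]^T \]
\[ y^{(7)} = 2[0.328 \;\; 1 \;\; 0.328]^T \]
\[ y^{(8)} = 2[0.336 \;\; 1 \;\; 0.336]^T \]

Clearly $y^{(k)}$ is approaching the vector $2[\tfrac{1}{3} \;\; 1 \;\; \tfrac{1}{3}]^T$, so that the dominant eigenvalue is 2 and the corresponding eigenvector is $[\tfrac{1}{3} \;\; 1 \;\; \tfrac{1}{3}]^T$, which conforms to the answer obtained in Example 1.4.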
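The iteration (1.14) translates directly into code. The following Python sketch is one possible rendering, not a definitive implementation: the function name `power_method`, the tolerance and the iteration cap are assumptions, and it presumes the scaling factor never becomes zero. It stops when successive scaled vectors agree rather than when successive eigenvalue estimates agree, because in Example 1.10 the scaling factor equals 2 at every step while the vector is still changing, so a test on the eigenvalue estimate alone would stop too early.

```python
def power_method(A, x0, tol=1e-10, max_iter=500):
    """Scaled power iteration (1.14); returns (eigenvalue estimate, scaled vector)."""
    x = list(x0)
    m = 0.0
    for _ in range(max_iter):
        y = [sum(a * v for a, v in zip(row, x)) for row in A]   # y = A x
        m = max(y, key=abs)            # element of largest modulus, sign retained
        x_new = [v / m for v in y]     # scale so the largest element is unity
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return m, x_new
        x = x_new
    return m, x


A = [[1, 1, -2], [-1, 2, 1], [0, 1, -1]]        # matrix of Example 1.10
eigenvalue, eigenvector = power_method(A, [1, 1, 1])
print(eigenvalue, eigenvector)
```

Since the ratios $|\lambda_2/\lambda_1|$ and $|\lambda_3/\lambda_1|$ are both 1/2 for this matrix, the error is roughly halved at each step, which is consistent with the slow approach to 1/3 seen in the hand calculation.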


Example 1.11  Find the dominant eigenvalue of

\[ A = \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 1 & 0 \\ -1 & 1 & 2 & 1 \\ 0 & 0 & 1 & -1 \end{bmatrix} \]

Solution  Starting with $x^{(0)} = [1 \;\; 1 \;\; 1 \;\; 1]^T$, the iterations give the following:

    Iteration k    0    1        2        3        4        5        6        7
    Eigenvalue     -    3        2.6667   3.3750   3.0741   3.2048   3.1636   3.1642
    x1(k)          1    0       -0.3750  -0.4074  -0.4578  -0.4549  -0.4621  -0.4621
    x2(k)          1    0.6667   0.6250   0.4815   0.4819   0.4624   0.4621   0.4621
    x3(k)          1    1        1        1        1        1        1        1
    x4(k)          1    0        0.3750   0.1852   0.2651   0.2293   0.2403   0.2401

This indicates that the dominant eigenvalue is approximately 3.16, with corresponding eigenvector $[-0.46 \;\; 0.46 \;\; 1 \;\; 0.24]^T$.


The power method is suitable for obtaining the dominant eigenvalue and corresponding eigenvector of a matrix A having real distinct eigenvalues. The smallest eigenvalue, provided it is non-zero, can be obtained by using the same method on the inverse matrix $A^{-1}$ when it exists. This follows since if $Ax = \lambda x$ then $A^{-1}x = \lambda^{-1}x$. To find the subdominant eigenvalue using this method the dominant eigenvalue must first be removed from the matrix using deflation methods. We shall illustrate such a method for symmetric matrices only.

Let A be a symmetric matrix having real eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then, by result (b) of Section 1.4.7, it has n corresponding mutually orthogonal normalized eigenvectors $\hat{e}_1, \hat{e}_2, \ldots, \hat{e}_n$ such that

\[ \hat{e}_i^T \hat{e}_j = \delta_{ij} \quad (i, j = 1, 2, \ldots, n) \]

Let $\lambda_1$ be the dominant eigenvalue and consider the matrix

\[ A_1 = A - \lambda_1 \hat{e}_1 \hat{e}_1^T \]

which is such that

\[ A_1 \hat{e}_1 = (A - \lambda_1 \hat{e}_1 \hat{e}_1^T)\hat{e}_1 = A\hat{e}_1 - \lambda_1 \hat{e}_1(\hat{e}_1^T \hat{e}_1) = \lambda_1 \hat{e}_1 - \lambda_1 \hat{e}_1 = 0 \]
\[ A_1 \hat{e}_2 = A\hat{e}_2 - \lambda_1 \hat{e}_1(\hat{e}_1^T \hat{e}_2) = \lambda_2 \hat{e}_2 \]
\[ A_1 \hat{e}_3 = A\hat{e}_3 - \lambda_1 \hat{e}_1(\hat{e}_1^T \hat{e}_3) = \lambda_3 \hat{e}_3 \]
\[ \vdots \]
\[ A_1 \hat{e}_n = A\hat{e}_n - \lambda_1 \hat{e}_1(\hat{e}_1^T \hat{e}_n) = \lambda_n \hat{e}_n \]

Thus the matrix $A_1$ has the same eigenvalues and eigenvectors as the matrix A, except that the eigenvalue corresponding to $\lambda_1$ is now zero. The power method can then be applied to the matrix $A_1$ to obtain the subdominant eigenvalue $\lambda_2$ and its corresponding eigenvector $e_2$. By repeated use of this technique, we can determine all the eigenvalues and corresponding eigenvectors of A.


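Example 1.12  Given that the symmetric matrix

\[ A = \begin{bmatrix} 2 & 2 & 0 \\ 2 & 5 & 0 \\ 0 & 0 & 3 \end{bmatrix} \]

has a dominant eigenvalue $\lambda_1 = 6$ with corresponding normalized eigenvector $\hat{e}_1 = [1 \;\; 2 \;\; 0]^T/\sqrt{5}$, find the subdominant eigenvalue $\lambda_2$ and corresponding eigenvector $\hat{e}_2$.


Solution  Following the above procedure,

\[ A_1 = A - \lambda_1 \hat{e}_1 \hat{e}_1^T = \begin{bmatrix} 2 & 2 & 0 \\ 2 & 5 & 0 \\ 0 & 0 & 3 \end{bmatrix} - \frac{6}{5} \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} [1 \;\; 2 \;\; 0] = \begin{bmatrix} \tfrac{4}{5} & -\tfrac{2}{5} & 0 \\ -\tfrac{2}{5} & \tfrac{1}{5} & 0 \\ 0 & 0 & 3 \end{bmatrix} \]

Applying the power method procedure (1.14), with $x^{(0)} = [1 \;\; 1 \;\; 1]^T$, gives

\[ y^{(1)} = A_1 x^{(0)} = \begin{bmatrix} \tfrac{2}{5} \\ -\tfrac{1}{5} \\ 3 \end{bmatrix} = 3 \begin{bmatrix} \tfrac{2}{15} \\ -\tfrac{1}{15} \\ 1 \end{bmatrix}; \quad \lambda_2^{(1)} = 3 \]

\[ x^{(1)} = \begin{bmatrix} \tfrac{2}{15} \\ -\tfrac{1}{15} \\ 1 \end{bmatrix} = \begin{bmatrix} 0.133 \\ -0.067 \\ 1 \end{bmatrix} \]

\[ y^{(2)} = A_1 x^{(1)} = \begin{bmatrix} \tfrac{2}{15} \\ -\tfrac{1}{15} \\ 3 \end{bmatrix} = 3 \begin{bmatrix} \tfrac{2}{45} \\ -\tfrac{1}{45} \\ 1 \end{bmatrix}; \quad \lambda_2^{(2)} = 3 \]

\[ x^{(2)} = \begin{bmatrix} \tfrac{2}{45} \\ -\tfrac{1}{45} \\ 1 \end{bmatrix} = \begin{bmatrix} 0.044 \\ -0.022 \\ 1 \end{bmatrix} \]

\[ y^{(3)} = A_1 x^{(2)} = \begin{bmatrix} \tfrac{2}{45} \\ -\tfrac{1}{45} \\ 3 \end{bmatrix} = 3 \begin{bmatrix} \tfrac{2}{135} \\ -\tfrac{1}{135} \\ 1 \end{bmatrix}; \quad \lambda_2^{(3)} = 3 \]

\[ x^{(3)} = \begin{bmatrix} 0.015 \\ -0.007 \\ 1 \end{bmatrix} \]

Clearly the subdominant eigenvalue of A is $\lambda_2 = 3$, and a few more iterations confirm the corresponding normalized eigenvector as $\hat{e}_2 = [0 \;\; 0 \;\; 1]^T$. This is confirmed by the solution of Example 1.9. Note that the third eigenvalue may then be obtained using Property 1.1 of Section 1.4.6, since

\[ \mathrm{trace}\,A = 10 = \lambda_1 + \lambda_2 + \lambda_3 = 6 + 3 + \lambda_3 \]

giving $\lambda_3 = 1$. Alternatively, $\lambda_3$ and $\hat{e}_3$ can be obtained by applying the power method to the matrix $A_2 = A_1 - \lambda_2 \hat{e}_2 \hat{e}_2^T$.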
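The deflation step of Example 1.12 can be sketched as follows. This is an illustration under stated assumptions: `deflate` and `power_method` are hypothetical helper names, the power iteration is run for a fixed 60 steps rather than to a tolerance, and the scaling factor is assumed never to vanish.

```python
import math


def power_method(A, x0, iterations=60):
    """Scaled power iteration (1.14) run for a fixed number of steps."""
    x = list(x0)
    m = 0.0
    for _ in range(iterations):
        y = [sum(a * v for a, v in zip(row, x)) for row in A]
        m = max(y, key=abs)          # element of largest modulus
        x = [v / m for v in y]
    return m, x


def deflate(A, lam, e_hat):
    """Return A - lam * e_hat e_hat^T for a normalized eigenvector e_hat."""
    n = len(A)
    return [[A[i][j] - lam * e_hat[i] * e_hat[j] for j in range(n)]
            for i in range(n)]


A = [[2, 2, 0], [2, 5, 0], [0, 0, 3]]
e1_hat = [1 / math.sqrt(5), 2 / math.sqrt(5), 0.0]   # dominant eigenvector, lambda_1 = 6

A1 = deflate(A, 6, e1_hat)
lam2, e2 = power_method(A1, [1, 1, 1])
print(A1)
print(lam2, e2)
```

The eigenvalues of the deflated matrix are 3, 1 and 0, so the iteration converges with ratio 1/3, in line with the factors of 3 between successive iterates in the hand calculation.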


Although it is good as an illustration of the principles underlying iterative methods for evaluating eigenvalues and eigenvectors, the power method is of little practical importance, except possibly when dealing with large sparse matrices. In order to evaluate all the eigenvalues and eigenvectors of a matrix, including those with repeated eigenvalues, more sophisticated methods are required. Many of the numerical methods available, such as the Jacobi and Householder methods, are only applicable to symmetric matrices, and involve reducing the matrix to a special form so that its eigenvalues can be readily calculated. Analogous methods for non-symmetric matrices are the LR and QR methods. It is methods such as these, together with others based on the inverse iterative method, that form the basis of the algorithms that exist in modern software packages such as MATLAB. Such methods will not be pursued further here, and the interested reader is referred to specialist texts on numerical analysis.


1.5.2 Gerschgorin circles


In many engineering applications it is not necessary to obtain accurate approximations 
to the eigenvalues of a matrix. All that is often required are bounds on the eigenvalues. 
For example, when considering the stability of continuous- or discrete-time systems 
(see Sections 5.7—6.8), we are concerned as to whether the eigenvalues lie in the 
negative half-plane or within the unit circle in the complex plane. (Note that the eigen- 
values of a non-symmetric matrix can be complex.) The Gerschgorin theorems often 
provide a quick method to answer such questions without the need for detailed calcula- 
tions. These theorems may be stated as follows. 


Theorem 1.1  First Gerschgorin theorem

Every eigenvalue of the matrix $A = [a_{ij}]$, of order n, lies inside at least one of the circles (called Gerschgorin circles) in the complex plane with centre $a_{ii}$ and radii $r_i = \sum_{j=1, j \ne i}^{n} |a_{ij}|$ ($i = 1, 2, \ldots, n$). Expressed in another form, all the eigenvalues of the matrix $A = [a_{ij}]$ lie in the union of the discs

\[ |z - a_{ii}| \le r_i = \sum_{\substack{j=1 \\ j \ne i}}^{n} |a_{ij}| \quad (i = 1, 2, \ldots, n) \]

in the complex z plane.

end of theorem


Theorem 1.2  Second Gerschgorin theorem

If the union of s of the Gerschgorin circles forms a connected region isolated from the remaining circles then exactly s of the eigenvalues lie within this region.

end of theorem


Since the disc $|z - a_{ii}| \le r_i$ is contained within the disc

\[ |z| \le |a_{ii}| + r_i = \sum_{j=1}^{n} |a_{ij}| \]

centred at the origin, we have a less precise but more easily applied criterion that all the eigenvalues of the matrix A lie within the disc

\[ |z| \le \max_i \left\{ \sum_{j=1}^{n} |a_{ij}| \right\} \quad (i = 1, 2, \ldots, n) \qquad (1.15) \]

centred at the origin.

Figure 1.2 Gerschgorin circles for the matrix A of Example 1.13.
The spectral radius $\rho(A)$ of a matrix A is the modulus of its dominant eigenvalue; that is,

\[ \rho(A) = \max\{|\lambda_i|\} \quad (i = 1, 2, \ldots, n) \qquad (1.16) \]

where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of A. Geometrically, $\rho(A)$ is the radius of the smallest circle centred at the origin in the complex plane such that all the eigenvalues of A lie inside the circle. It follows from (1.15) that

\[ \rho(A) \le \max_i \left\{ \sum_{j=1}^{n} |a_{ij}| \right\} \quad (i = 1, 2, \ldots, n) \qquad (1.17) \]


Example 1.13  Draw the Gerschgorin circles corresponding to the matrix

\[ A = \begin{bmatrix} 10 & -1 & 0 \\ -1 & 2 & 2 \\ 0 & 2 & 3 \end{bmatrix} \]

What can be concluded about the eigenvalues of A?


Solution  The three Gerschgorin circles are

(i)   $|z - 10| = |-1| + 0 = 1$
(ii)  $|z - 2| = |-1| + |2| = 3$
(iii) $|z - 3| = |2| = 2$

and are illustrated in Figure 1.2.

It follows from Theorem 1.2 that one eigenvalue lies within the circle centred (10, 0) of radius 1, and two eigenvalues lie within the union of the other two circles; that is, within the circle centred at (2, 0) of radius 3. Since the matrix A is symmetric, it follows from result (a) of Section 1.4.7 that the eigenvalues are real. Hence

\[ 9 < \lambda_1 < 11 \]

\[ -1 < \{\lambda_2, \lambda_3\} < 5 \]
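The centres and radii used in Example 1.13 follow mechanically from the rows of the matrix. A brief Python sketch is given below; the function names `gerschgorin_discs` and `spectral_radius_bound` are illustrative assumptions. It lists each disc as a (centre, radius) pair and evaluates the row-sum bound (1.17).

```python
def gerschgorin_discs(A):
    """Return (centre, radius) for each row: centre a_ii, radius sum of |a_ij|, j != i."""
    return [(row[i], sum(abs(v) for j, v in enumerate(row) if j != i))
            for i, row in enumerate(A)]


def spectral_radius_bound(A):
    """Upper bound (1.17): the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in A)


A = [[10, -1, 0], [-1, 2, 2], [0, 2, 3]]    # matrix of Example 1.13

discs = gerschgorin_discs(A)
bound = spectral_radius_bound(A)
print(discs)
print(bound)

# For a real symmetric matrix each disc reduces to a real interval
intervals = [(c - r, c + r) for c, r in discs]
print(intervals)
```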






1.5.3 Exercises


14  Use the power method to estimate the dominant eigenvalue and its corresponding eigenvector for the matrix
А= 


N U A 


3 
5 
2 


— N N 
2 
Co 


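Stop when you consider the eigenvalue estimate is correct to two decimal places.


15  Repeat Exercise 14 for the matrices

\[ \text{(a)} \; A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix} \qquad \text{(b)} \; A = \begin{bmatrix} 3 & 0 & 1 \\ 2 & 2 & 2 \\ 4 & 2 & 5 \end{bmatrix} \]

\[ \text{(c)} \; A = \begin{bmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{bmatrix} \]

16  The symmetric matrix

\[ A = \begin{bmatrix} 3 & 1 & 1 \\ 1 & 3 & 1 \\ 1 & 1 & 5 \end{bmatrix} \]

has dominant eigenvector $e_1 = [1 \;\; 1 \;\; 2]^T$. Obtain the matrix

\[ A_1 = A - \lambda_1 \hat{e}_1 \hat{e}_1^T \]

where $\lambda_1$ is the eigenvalue corresponding to the eigenvector $e_1$. Using the deflation method, obtain the subdominant eigenvalue $\lambda_2$ and corresponding eigenvector $e_2$ correct to two decimal places, taking $[1 \;\; 1 \;\; 1]^T$ as a first approximation to $e_2$. Continue the process to obtain the third eigenvalue $\lambda_3$ and its corresponding eigenvector $e_3$.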


17  Draw the Gerschgorin circles corresponding to the matrix

\[ A = \begin{bmatrix} 5 & 1 & -1 \\ 1 & 0 & 1 \\ -1 & 1 & -5 \end{bmatrix} \]

and hence show that the three eigenvalues are such that

\[ 3 < \lambda_1 < 7, \quad -2 < \lambda_2 < 2, \quad -7 < \lambda_3 < -3 \]

18  Show that the characteristic equation of the matrix

\[ A = \begin{bmatrix} 10 & -1 & 0 \\ -1 & 2 & 2 \\ 0 & 2 & 3 \end{bmatrix} \]

of Example 1.13 is

\[ f(\lambda) = \lambda^3 - 15\lambda^2 + 51\lambda - 17 = 0 \]

Using the Newton-Raphson iterative procedure

\[ \lambda_{n+1} = \lambda_n - \frac{f(\lambda_n)}{f'(\lambda_n)} \]

determine the eigenvalue identified in Example 1.13 to lie in the interval $9 < \lambda < 11$, correct to three decimal places.

Using Properties 1.1 and 1.2 of Section 1.4.6, determine the other two eigenvalues of A to the same approximation.





19  (a) If the eigenvalues of the n x n matrix A are

\[ \lambda_1 > \lambda_2 > \lambda_3 > \ldots > \lambda_n \ge 0 \]

show that the eigenvalue $\lambda_n$ can be found by applying the power method to the matrix $kI - A$, where I is the identity matrix and $k > \lambda_1$.

(b) By considering the Gerschgorin circles, show that the eigenvalues of the matrix

\[ A = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix} \]

satisfy the inequality

\[ 0 \le \lambda \le 4 \]

Hence, using the result proved in (a), determine the smallest modulus eigenvalue of A correct to two decimal places.


1.6 Reduction to canonical form


In this section we examine the process of reduction of a matrix to canonical form. 
Specifically, we examine methods by which certain square matrices can be reduced or 
transformed into diagonal form. The process of transformation can be thought of as a 
change of system coordinates, with the new coordinate axes chosen in such a way that 
the system can be expressed in a simple form. The simplification may, for example, be 
a transformation to principal axes or a decoupling of system equations. 

We will see that not all matrices can be reduced to diagonal form. In some cases we 
can only achieve the so-called Jordan canonical form, but many of the advantages of 
the diagonal form can be extended to this case as well. 

The transformation to diagonal form is just one example of a similarity transform. 
Other such transforms exist, but, in common with the transformation to diagonal form, 
their purpose is usually that of simplifying the system model in some way. 


Reduction to diagonal form 


For an $n \times n$ matrix $A$ possessing a full set of $n$ linearly independent eigenvectors
$e_1, e_2, \dots, e_n$ we can write down a modal matrix $M$ having the $n$ eigenvectors as its
columns:

\[ M = [e_1 \;\; e_2 \;\; e_3 \;\; \dots \;\; e_n] \]

The diagonal matrix having the eigenvalues of $A$ as its diagonal elements is called
the spectral matrix corresponding to the modal matrix $M$ of $A$, often denoted by $\Lambda$.
That is,

\[ \Lambda = \begin{bmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_n \end{bmatrix} \]

with the $ij$th element being given by $\lambda_i \delta_{ij}$, where $\delta_{ij}$ is the Kronecker delta and
$i, j = 1, 2, \dots, n$. It is important in the work that follows that the pair of matrices $M$ and $\Lambda$
are written down correctly. If the $i$th column of $M$ is the eigenvector $e_i$ then the
element in the $(i, i)$ position in $\Lambda$ must be $\lambda_i$, the eigenvalue corresponding to the
eigenvector $e_i$.


We saw in Section 1.4.2 that in MATLAB the command

[M,S]=eig(A)

generates the modal and spectral matrices for the matrix A. (Note: for convenience
S is used to represent $\Lambda$ when using MATLAB, whilst both are produced by the
command Eigenvectors(A) in MAPLE.)


Example 1.14 Obtain a modal matrix and the corresponding spectral matrix for the matrix $A$ of
Example 1.4.

Solution The matrix is

\[ A = \begin{bmatrix} 1 & 1 & -2 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{bmatrix} \]

having eigenvalues $\lambda_1 = 2$, $\lambda_2 = 1$ and $\lambda_3 = -1$, with corresponding eigenvectors

\[ e_1 = [1 \;\; 3 \;\; 1]^T, \quad e_2 = [3 \;\; 2 \;\; 1]^T, \quad e_3 = [1 \;\; 0 \;\; 1]^T \]

Choosing as modal matrix $M = [e_1 \;\; e_2 \;\; e_3]$ gives

\[ M = \begin{bmatrix} 1 & 3 & 1 \\ 3 & 2 & 0 \\ 1 & 1 & 1 \end{bmatrix} \]

The corresponding spectral matrix is

\[ \Lambda = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \]

Returning to the general case, if we premultiply the matrix $M$ by $A$, we obtain

\[ AM = A[e_1 \;\; e_2 \;\; \dots \;\; e_n] = [Ae_1 \;\; Ae_2 \;\; \dots \;\; Ae_n] = [\lambda_1 e_1 \;\; \lambda_2 e_2 \;\; \dots \;\; \lambda_n e_n] \]

so that

\[ AM = M\Lambda \qquad (1.18) \]

Since the $n$ eigenvectors $e_1, e_2, \dots, e_n$ are linearly independent, the matrix $M$ is
non-singular, so that $M^{-1}$ exists. Thus premultiplying by $M^{-1}$ gives

\[ M^{-1}AM = M^{-1}M\Lambda = \Lambda \qquad (1.19) \]

indicating that the similarity transformation $M^{-1}AM$ reduces the matrix $A$ to the diagonal
or canonical form $\Lambda$. Thus a matrix $A$ possessing a full set of linearly independent
eigenvectors is reducible to diagonal form, and the reduction process is often referred
to as the diagonalization of the matrix $A$. Since

\[ A = M\Lambda M^{-1} \qquad (1.20) \]

it follows that $A$ is uniquely determined once the eigenvalues and corresponding
eigenvectors are known. Note that knowledge of the eigenvalues and eigenvectors alone is
not sufficient: in order to structure $M$ and $\Lambda$ correctly, the association of eigenvalues
and the corresponding eigenvectors must also be known.


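Example 1.15 Verify results (1.19) and (1.20) for the matrix $A$ of Example 1.14.

Solution Since

\[ M = \begin{bmatrix} 1 & 3 & 1 \\ 3 & 2 & 0 \\ 1 & 1 & 1 \end{bmatrix} \quad \text{we have} \quad M^{-1} = \frac{1}{6}\begin{bmatrix} -2 & 2 & 2 \\ 3 & 0 & -3 \\ -1 & -2 & 7 \end{bmatrix} \]

Taking

\[ \Lambda = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \]

matrix multiplication confirms the results

\[ M^{-1}AM = \Lambda, \qquad A = M\Lambda M^{-1} \]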
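
The check in Example 1.15 can also be carried out mechanically. The following is a minimal sketch in Python using exact rational arithmetic; the helper name matmul is an illustrative choice and not part of the text.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 1, -2], [-1, 2, 1], [0, 1, -1]]
M = [[1, 3, 1], [3, 2, 0], [1, 1, 1]]        # columns e1, e2, e3
Lam = [[2, 0, 0], [0, 1, 0], [0, 0, -1]]     # eigenvalues in the same order

# M^{-1} = (1/6) [[-2, 2, 2], [3, 0, -3], [-1, -2, 7]], kept as exact fractions
M_inv = [[Fraction(v, 6) for v in row]
         for row in [[-2, 2, 2], [3, 0, -3], [-1, -2, 7]]]

reduced = matmul(matmul(M_inv, A), M)    # result (1.19): should equal Lam
rebuilt = matmul(matmul(M, Lam), M_inv)  # result (1.20): should equal A
print(reduced == Lam, rebuilt == A)
```

Swapping two columns of M without swapping the matching diagonal entries of Lam makes the first comparison fail, which illustrates why the pairing of eigenvalues and eigenvectors matters.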


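For an $n \times n$ symmetric matrix $A$ it follows, from result (b) of Section 1.4.7, that
to the $n$ real eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ there correspond $n$ linearly independent
normalized eigenvectors $\hat{e}_1, \hat{e}_2, \dots, \hat{e}_n$ that are mutually orthogonal so that

\[ \hat{e}_i^T \hat{e}_j = \delta_{ij} \qquad (i, j = 1, 2, \dots, n) \]

The corresponding modal matrix

\[ \hat{M} = [\hat{e}_1 \;\; \hat{e}_2 \;\; \dots \;\; \hat{e}_n] \]

is then such that

\[ \hat{M}^T\hat{M} = \begin{bmatrix} \hat{e}_1^T \\ \hat{e}_2^T \\ \vdots \\ \hat{e}_n^T \end{bmatrix} [\hat{e}_1 \;\; \hat{e}_2 \;\; \dots \;\; \hat{e}_n]
= \begin{bmatrix} \hat{e}_1^T\hat{e}_1 & \hat{e}_1^T\hat{e}_2 & \dots & \hat{e}_1^T\hat{e}_n \\ \hat{e}_2^T\hat{e}_1 & \hat{e}_2^T\hat{e}_2 & \dots & \hat{e}_2^T\hat{e}_n \\ \vdots & \vdots & & \vdots \\ \hat{e}_n^T\hat{e}_1 & \hat{e}_n^T\hat{e}_2 & \dots & \hat{e}_n^T\hat{e}_n \end{bmatrix}
= \begin{bmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & 1 \end{bmatrix} = I \]

That is, $\hat{M}^T\hat{M} = I$ and so $\hat{M}^T = \hat{M}^{-1}$. Thus $\hat{M}$ is an orthogonal matrix (the term
orthonormal matrix would be more appropriate, but the nomenclature is long established).

It follows from (1.19) that a symmetric matrix $A$ can be reduced to diagonal form $\Lambda$
using the orthogonal transformation

\[ \hat{M}^T A \hat{M} = \Lambda \qquad (1.21) \]


Example 1.16 For the symmetric matrix $A$ considered in Example 1.9 write down the corresponding
orthogonal modal matrix $\hat{M}$ and show that $\hat{M}^T A \hat{M} = \Lambda$, where $\Lambda$ is the spectral matrix.

Solution From Example 1.9 the eigenvalues are $\lambda_1 = 6$, $\lambda_2 = 3$ and $\lambda_3 = 1$, with corresponding
normalized eigenvectors

\[ \hat{e}_1 = [1 \;\; 2 \;\; 0]^T/\sqrt{5}, \quad \hat{e}_2 = [0 \;\; 0 \;\; 1]^T, \quad \hat{e}_3 = [-2 \;\; 1 \;\; 0]^T/\sqrt{5} \]

The corresponding modal matrix is

\[ \hat{M} = \begin{bmatrix} 1/\sqrt{5} & 0 & -2/\sqrt{5} \\ 2/\sqrt{5} & 0 & 1/\sqrt{5} \\ 0 & 1 & 0 \end{bmatrix} \]

and, by matrix multiplication,

\[ \hat{M}^T A \hat{M} = \begin{bmatrix} 1/\sqrt{5} & 2/\sqrt{5} & 0 \\ 0 & 0 & 1 \\ -2/\sqrt{5} & 1/\sqrt{5} & 0 \end{bmatrix} \begin{bmatrix} 2 & 2 & 0 \\ 2 & 5 & 0 \\ 0 & 0 & 3 \end{bmatrix} \begin{bmatrix} 1/\sqrt{5} & 0 & -2/\sqrt{5} \\ 2/\sqrt{5} & 0 & 1/\sqrt{5} \\ 0 & 1 & 0 \end{bmatrix}
= \begin{bmatrix} 6 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \Lambda \]


1.6.2 The Jordan canonical form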


If an $n \times n$ matrix $A$ does not possess a full set of linearly independent eigenvectors
then it cannot be reduced to diagonal form using the similarity transformation $M^{-1}AM$.
In such a case, however, it is possible to reduce $A$ to a Jordan canonical form, making
use of 'generalized' eigenvectors.

As indicated in (1.11), if a matrix $A$ has an eigenvalue $\lambda_i$ of algebraic multiplicity
$m_i$ and geometric multiplicity $q_i$, with $1 \leq q_i < m_i$, then there are $q_i$ linearly independent
eigenvectors corresponding to $\lambda_i$. Consequently, we need to generate $m_i - q_i$ generalized
eigenvectors in order to produce a full set. To obtain these, we first obtain the $q_i$ linearly
independent eigenvectors by solving

\[ (A - \lambda_i I)e_i = 0 \]

Then for each of these vectors we try to construct a generalized eigenvector $e_i^*$ such
that

\[ (A - \lambda_i I)e_i^* = e_i \]

If the resulting vector $e_i^*$ is linearly independent of all the eigenvectors (and generalized
eigenvectors) already found then it is a valid additional generalized eigenvector. If
further generalized eigenvectors corresponding to $\lambda_i$ are needed, we then repeat the
process using

\[ (A - \lambda_i I)e_i^{**} = e_i^* \]

and so on until sufficient vectors are found.


Example 1.17 Obtain a generalized eigenvector corresponding to the eigenvalue $\lambda = 2$ of Example 1.7.

Solution For

\[ A = \begin{bmatrix} 1 & 2 & 2 \\ 0 & 2 & 1 \\ -1 & 2 & 2 \end{bmatrix} \]

we found in Example 1.7 that corresponding to the eigenvalue $\lambda_1 = 2$ there was only
one linearly independent eigenvector

\[ e_1 = [2 \;\; 1 \;\; 0]^T \]

and we need to find a generalized eigenvector to produce a full set. To obtain the
generalized eigenvector $e_1^*$, we solve

\[ (A - 2I)e_1^* = e_1 \]

that is, we solve

\[ \begin{bmatrix} -1 & 2 & 2 \\ 0 & 0 & 1 \\ -1 & 2 & 0 \end{bmatrix} \begin{bmatrix} e_{11}^* \\ e_{12}^* \\ e_{13}^* \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} \]

At once, we have $e_{13}^* = 1$ and $e_{11}^* = 2e_{12}^*$, and so

\[ e_1^* = [2 \;\; 1 \;\; 1]^T \]

Thus, by including generalized eigenvectors, we have a full set of eigenvectors for the
matrix $A$ given by

\[ e_1 = [2 \;\; 1 \;\; 0]^T, \quad e_2 = [2 \;\; 1 \;\; 1]^T, \quad e_3 = [1 \;\; 1 \;\; -1]^T \]


If we include the generalized eigenvectors, it is always possible to obtain for
an $n \times n$ matrix $A$ a modal matrix $M$ with $n$ linearly independent columns $e_1, e_2,
\dots, e_n$. Corresponding to (1.18), we have

\[ AM = MJ \]

where $J$ is called the Jordan form of $A$. Premultiplying by $M^{-1}$ then gives

\[ M^{-1}AM = J \qquad (1.22) \]

The process of reducing $A$ to $J$ is known as the reduction of $A$ to its Jordan normal, or
canonical, form.
If $A$ has $p$ distinct eigenvalues then the matrix $J$ is of the block-diagonal form

\[ J = \begin{bmatrix} J_1 & & & \\ & J_2 & & \\ & & \ddots & \\ & & & J_p \end{bmatrix} \]

where each submatrix $J_i$ ($i = 1, 2, \dots, p$) is associated with the corresponding
eigenvalue $\lambda_i$. The submatrix $J_i$ will have $\lambda_i$ as its leading diagonal elements, with zeros
elsewhere except on the diagonal above the leading diagonal. On this diagonal the
entries will have the value 1 or 0, depending on the number of generalized eigenvectors
used and how they were generated. To illustrate this, suppose that $A$ is a $7 \times 7$ matrix
with eigenvalues $\lambda_1 = 1$, $\lambda_2 = 2$ (occurring twice), $\lambda_3 = 3$ (occurring four times), and
suppose that the number of linearly independent eigenvectors generated in each case is

$\lambda_1 = 1$, 1 eigenvector
$\lambda_2 = 2$, 1 eigenvector
$\lambda_3 = 3$, 2 eigenvectors

with one further generalized eigenvector having been determined for $\lambda_2 = 2$ and two
more for $\lambda_3 = 3$.

Corresponding to $\lambda_1 = 1$, the Jordan block $J_1$ will be just $[1]$, while that corresponding
to $\lambda_2 = 2$ will be

\[ J_2 = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix} \]

Corresponding to $\lambda_3 = 3$, the Jordan block $J_3$ can take one of the two forms

\[ J_{3,1} = \begin{bmatrix} \lambda_3 & 1 & 0 & 0 \\ 0 & \lambda_3 & 1 & 0 \\ 0 & 0 & \lambda_3 & 0 \\ 0 & 0 & 0 & \lambda_3 \end{bmatrix} \quad \text{or} \quad J_{3,2} = \begin{bmatrix} \lambda_3 & 1 & 0 & 0 \\ 0 & \lambda_3 & 0 & 0 \\ 0 & 0 & \lambda_3 & 1 \\ 0 & 0 & 0 & \lambda_3 \end{bmatrix} \]

depending on how the generalized eigenvectors are generated. Corresponding to $\lambda_3 = 3$,
we had two linearly independent eigenvectors $e_{3,1}$ and $e_{3,2}$. If both generalized
eigenvectors are generated from one of these vectors then $J_3$ will take the form $J_{3,1}$, whereas
if one generalized eigenvector has been generated from each eigenvector then $J_3$ will
take the form $J_{3,2}$.


Example 1.18 Obtain the Jordan canonical form of the matrix $A$ of Example 1.17, and show that
$M^{-1}AM = J$ where $M$ is a modal matrix that includes generalized eigenvectors.

Solution For

\[ A = \begin{bmatrix} 1 & 2 & 2 \\ 0 & 2 & 1 \\ -1 & 2 & 2 \end{bmatrix} \]

from Example 1.17 we know that the eigenvalues of $A$ are $\lambda_1 = 2$ (twice) and $\lambda_3 = 1$.
The eigenvector corresponding to $\lambda_3 = 1$ has been determined as $e_3 = [1 \;\; 1 \;\; -1]^T$ in
Example 1.7 and corresponding to $\lambda_1 = 2$ we found one linearly independent
eigenvector $e_1 = [2 \;\; 1 \;\; 0]^T$ and a generalized eigenvector $e_1^* = [2 \;\; 1 \;\; 1]^T$. Thus the modal
matrix including this generalized eigenvector is

\[ M = \begin{bmatrix} 2 & 2 & 1 \\ 1 & 1 & 1 \\ 0 & 1 & -1 \end{bmatrix} \]

and the corresponding Jordan canonical form is

\[ J = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

To check this result, we compute $M^{-1}$ as

\[ M^{-1} = \begin{bmatrix} 2 & -3 & -1 \\ -1 & 2 & 1 \\ -1 & 2 & 0 \end{bmatrix} \]

and, forming $M^{-1}AM$, we obtain $J$ as expected.
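
The construction used in Examples 1.17 and 1.18 can be checked with integer arithmetic alone. The sketch below assumes the matrix and vectors quoted in those examples; the names matmul, shifted and chain are illustrative only.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 2, 2], [0, 2, 1], [-1, 2, 2]]
e1 = [2, 1, 0]        # eigenvector for lambda = 2
e1_star = [2, 1, 1]   # generalized eigenvector for lambda = 2
e3 = [1, 1, -1]       # eigenvector for lambda = 1

# Chain condition (A - 2I) e1* = e1
shifted = [[A[i][j] - (2 if i == j else 0) for j in range(3)] for i in range(3)]
chain = [sum(shifted[i][j] * e1_star[j] for j in range(3)) for i in range(3)]

# Modal matrix with columns e1, e1*, e3 and the matching Jordan form
M = [[e1[i], e1_star[i], e3[i]] for i in range(3)]
J = [[2, 1, 0], [0, 2, 0], [0, 0, 1]]

print(chain == e1)                    # the chain condition holds
print(matmul(A, M) == matmul(M, J))   # AM = MJ, equivalent to (1.22)
```

Checking AM = MJ rather than M^{-1}AM = J avoids computing an inverse, so the comparison stays exact.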


In MATLAB the command J=jordan(A) computes the Jordan form of A, including
the case when J is diagonal and all the eigenvectors of A are linearly independent.
The command

[M,J]=jordan(A)

also computes the similarity transformation or modal matrix M that may include
generalized eigenvectors.

Numerical calculation of the Jordan form is very sensitive to round-off errors.
This makes it very difficult to compute the Jordan form reliably, and almost any
change in A causes it to be diagonal.

For the matrix A in Example 1.18 the sequence of commands

A=[1 2 2; 0 2 1; -1 2 2];
[M,J]=jordan(A)

returns

M =
   -1  -2   2
   -1  -1   1
    1   0  -1
J =
    1   0   0
    0   2   1
    0   0   2

which is equally acceptable to the solution given in Example 1.18. (This can be
checked by evaluating $M^{-1}AM$.)
Using the Symbolic Math Toolbox in MATLAB the sequence of commands

A=[1 2 2; 0 2 1; -1 2 2];
AS=sym(A);
[M,J]=jordan(AS)
returns the same output as above. In practice, this sequence of commands is only
really effective when the elements of the matrix A are integers or ratios of small
integers.


1.6.3 Exercises


Check your answers using MATLAB or MAPLE whenever possible.


20 Show that the eigenvalues of the matrix

\[ A = \begin{bmatrix} -1 & 6 & -12 \\ 0 & -13 & 30 \\ 0 & -9 & 20 \end{bmatrix} \]

are 5, 2 and -1. Obtain the corresponding eigenvectors. Write down the modal matrix
$M$ and spectral matrix $\Lambda$. Evaluate $M^{-1}$ and show that $M^{-1}AM = \Lambda$.

21 Using the eigenvalues and corresponding eigenvectors of the symmetric matrix

\[ A = \begin{bmatrix} 2 & 2 & 0 \\ 2 & 5 & 0 \\ 0 & 0 & 3 \end{bmatrix} \]

obtained in Example 1.9, verify that $\hat{M}^T A \hat{M} = \Lambda$ where $\hat{M}$ and $\Lambda$ are respectively
a normalized modal matrix and a spectral matrix of $A$.

22 Given

\[ A = \begin{bmatrix} 5 & 10 & 8 \\ 10 & 2 & -2 \\ 8 & -2 & 11 \end{bmatrix} \]

find its eigenvalues and corresponding eigenvectors. Normalize the eigenvectors
and write down the corresponding normalized modal matrix $\hat{M}$. Write down $\hat{M}^T$ and show
that $\hat{M}^T A \hat{M} = \Lambda$, where $\Lambda$ is the spectral matrix of $A$.

23 Determine the eigenvalues and corresponding eigenvectors of the matrix

\[ A = \begin{bmatrix} 1 & 1 & -2 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{bmatrix} \]

Write down the modal matrix $M$ and spectral matrix $\Lambda$. Confirm that $M^{-1}AM = \Lambda$ and that
$A = M\Lambda M^{-1}$.

24 Determine the eigenvalues and corresponding eigenvectors of the symmetric matrix

\[ A = \begin{bmatrix} 3 & -2 & 4 \\ -2 & -2 & 6 \\ 4 & 6 & -1 \end{bmatrix} \]

Verify that the eigenvectors are orthogonal, and write down an orthogonal matrix $L$ such that
$L^T A L = \Lambda$, where $\Lambda$ is the spectral matrix of $A$.

25 A $3 \times 3$ symmetric matrix $A$ has eigenvalues 6, 3 and 1. The eigenvectors corresponding
to the eigenvalues 6 and 1 are $[1 \;\; 2 \;\; 0]^T$ and $[-2 \;\; 1 \;\; 0]^T$ respectively. Find the eigenvector
corresponding to the eigenvalue 3, and hence determine the matrix $A$.

26 Given that $\lambda = 1$ is a thrice-repeated eigenvalue of the matrix

\[ A = \begin{bmatrix} -3 & -7 & -5 \\ 2 & 4 & 3 \\ 1 & 2 & 2 \end{bmatrix} \]

use the nullity, given by (1.11), of a suitable matrix to show that there is only one
corresponding linearly independent eigenvector. Obtain two further generalized eigenvectors,
and write down the corresponding modal matrix $M$. Confirm that $M^{-1}AM = J$, where $J$ is the
appropriate Jordan matrix.

27 Show that the eigenvalues of the matrix

\[ A = \begin{bmatrix} 1 & 0 & 0 & -3 \\ 0 & 1 & -3 & 0 \\ -0.5 & -3 & 1 & 0.5 \\ -3 & 0 & 0 & 1 \end{bmatrix} \]

are -2, -2, 4 and 4. Using the nullity, given by (1.11), of appropriate matrices, show that
there are two linearly independent eigenvectors corresponding to the repeated eigenvalue -2
and only one corresponding to the repeated eigenvalue 4. Obtain a further generalized
eigenvector corresponding to the eigenvalue 4. Write down the Jordan canonical form of $A$.


1.6.4 Quadratic forms


A quadratic form in $n$ independent variables $x_1, x_2, \dots, x_n$ is a homogeneous
second-degree polynomial of the form

\[ \begin{aligned} V(x_1, x_2, \dots, x_n) &= \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}x_ix_j \\
&= a_{11}x_1^2 + a_{12}x_1x_2 + \dots + a_{1n}x_1x_n \\
&\quad + a_{21}x_2x_1 + a_{22}x_2^2 + \dots + a_{2n}x_2x_n \\
&\quad \;\vdots \\
&\quad + a_{n1}x_nx_1 + a_{n2}x_nx_2 + \dots + a_{nn}x_n^2 \end{aligned} \qquad (1.23) \]

Defining the vector $x = [x_1 \;\; x_2 \;\; \dots \;\; x_n]^T$ and the matrix

\[ A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{bmatrix} \]

the quadratic form (1.23) may be written in the form

\[ V(x) = x^TAx \qquad (1.24) \]

The matrix $A$ is referred to as the matrix of the quadratic form and the determinant of
$A$ is called the discriminant of the quadratic form.

Now $a_{ij}$ and $a_{ji}$ in (1.23) are both coefficients of the term $x_ix_j$ ($i \neq j$), so that for $i \neq j$
the coefficient of the term $x_ix_j$ is $a_{ij} + a_{ji}$. By defining new coefficients $a'_{ij}$ and $a'_{ji}$ for $x_ix_j$
and $x_jx_i$ respectively, such that $a'_{ij} = a'_{ji} = \tfrac{1}{2}(a_{ij} + a_{ji})$, the matrix $A$ associated with the
quadratic form $V(x)$ may be taken to be symmetric. Thus for real quadratic forms we
can, without loss of generality, consider the matrix $A$ to be a symmetric matrix.


Example 1.19 Find the real symmetric matrix corresponding to the quadratic form

\[ V(x_1, x_2, x_3) = x_1^2 + 3x_2^2 - 4x_3^2 - 3x_1x_2 + 2x_1x_3 - 5x_2x_3 \]

Solution If $x = [x_1 \;\; x_2 \;\; x_3]^T$, we have

\[ V(x_1, x_2, x_3) = [x_1 \;\; x_2 \;\; x_3] \begin{bmatrix} 1 & -\tfrac{3}{2} & 1 \\ -\tfrac{3}{2} & 3 & -\tfrac{5}{2} \\ 1 & -\tfrac{5}{2} & -4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = x^TAx \]
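
The symmetrization step, in which each cross-term coefficient is shared equally between the two off-diagonal positions, can be sketched as follows. The function names symmetric_matrix and quad_form, and the dictionary layout of the coefficients, are assumptions made for illustration.

```python
from fractions import Fraction

def symmetric_matrix(n, coeffs):
    """Build the symmetric matrix of a quadratic form.

    coeffs maps (i, j) with i <= j (1-based) to the coefficient of x_i x_j.
    """
    A = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), c in coeffs.items():
        if i == j:
            A[i - 1][i - 1] += c
        else:
            # split the cross-term coefficient equally between a_ij and a_ji
            A[i - 1][j - 1] += Fraction(c, 2)
            A[j - 1][i - 1] += Fraction(c, 2)
    return A

def quad_form(A, x):
    """Evaluate x^T A x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

# Coefficients of the form in Example 1.19
coeffs = {(1, 1): 1, (2, 2): 3, (3, 3): -4, (1, 2): -3, (1, 3): 2, (2, 3): -5}
A = symmetric_matrix(3, coeffs)
for row in A:
    print([str(v) for v in row])
```

Evaluating quad_form(A, x) at any x gives the same value as the original polynomial, since only the split of each cross term has changed.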
In Section 1.6.1 we saw that a real symmetric matrix $A$ can always be reduced to the
diagonal form

\[ \hat{M}^T A \hat{M} = \Lambda \]

where $\hat{M}$ is the normalized orthogonal modal matrix of $A$ and $\Lambda$ is its spectral matrix.
Thus for a real quadratic form we can specify a change of variables

\[ x = \hat{M}y \]

where $y = [y_1 \;\; y_2 \;\; \dots \;\; y_n]^T$, such that

\[ V = x^TAx = y^T\hat{M}^TA\hat{M}y = y^T\Lambda y \]

giving

\[ V = \lambda_1y_1^2 + \lambda_2y_2^2 + \dots + \lambda_ny_n^2 \qquad (1.25) \]

Hence the quadratic form $x^TAx$ may be reduced to the sum of squares by the
transformation $x = \hat{M}y$, where $\hat{M}$ is the normalized modal matrix of $A$. The resulting form
given in (1.25) is called the canonical form of the quadratic form $V$ given in (1.24).
The reduction of a quadratic form to its canonical form has many applications in
engineering, particularly in stress analysis.


Example 1.20 Find the canonical form of the quadratic form

\[ V = 2x_1^2 + 5x_2^2 + 3x_3^2 + 4x_1x_2 \]

Can $V$ take negative values for any values of $x_1$, $x_2$ and $x_3$?

Solution At once, we have

\[ V = x^T \begin{bmatrix} 2 & 2 & 0 \\ 2 & 5 & 0 \\ 0 & 0 & 3 \end{bmatrix} x = x^TAx \]

where

\[ x = [x_1 \;\; x_2 \;\; x_3]^T, \qquad A = \begin{bmatrix} 2 & 2 & 0 \\ 2 & 5 & 0 \\ 0 & 0 & 3 \end{bmatrix} \]

The real symmetric matrix $A$ is the matrix of Example 1.16, where we found the
normalized orthogonal modal matrix $\hat{M}$ and spectral matrix $\Lambda$ to be

\[ \hat{M} = \begin{bmatrix} 1/\sqrt{5} & 0 & -2/\sqrt{5} \\ 2/\sqrt{5} & 0 & 1/\sqrt{5} \\ 0 & 1 & 0 \end{bmatrix}, \qquad \Lambda = \begin{bmatrix} 6 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

such that $\hat{M}^TA\hat{M} = \Lambda$. Thus, setting $x = \hat{M}y$, we obtain

\[ V = y^T\hat{M}^TA\hat{M}y = y^T \begin{bmatrix} 6 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{bmatrix} y = 6y_1^2 + 3y_2^2 + y_3^2 \]

as the required canonical form.

Clearly $V$ is non-negative for all $y_1$, $y_2$ and $y_3$. Since $x = \hat{M}y$ and $\hat{M}$ is an orthogonal
matrix it follows that $y = \hat{M}^Tx$, so for all $x$ there is a corresponding $y$. It follows that $V$
cannot take negative values for any values of $x_1$, $x_2$ and $x_3$.
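
As a numerical illustration of Example 1.20, the sketch below evaluates the form both directly and through its canonical form, using the change of variables $y = \hat{M}^Tx$. It is a floating-point check only; the function names and the sample point are chosen for illustration.

```python
import math

s = 1 / math.sqrt(5)
# Normalized modal matrix of Example 1.20, columns e1_hat, e2_hat, e3_hat
M_hat = [[s, 0.0, -2 * s],
         [2 * s, 0.0, s],
         [0.0, 1.0, 0.0]]

def V(x):
    """The quadratic form 2 x1^2 + 5 x2^2 + 3 x3^2 + 4 x1 x2."""
    return 2 * x[0] ** 2 + 5 * x[1] ** 2 + 3 * x[2] ** 2 + 4 * x[0] * x[1]

def to_y(x):
    """New variables y = M_hat^T x."""
    return [sum(M_hat[i][j] * x[i] for i in range(3)) for j in range(3)]

def V_canonical(y):
    """Canonical form 6 y1^2 + 3 y2^2 + y3^2."""
    return 6 * y[0] ** 2 + 3 * y[1] ** 2 + y[2] ** 2

x = [1.0, -2.0, 0.5]
print(V(x), V_canonical(to_y(x)))   # the two values agree up to rounding
```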


The quadratic form of Example 1.20 was seen to be non-negative for any vector $x$,
and is positive provided that $x \neq 0$. Such a quadratic form $x^TAx$ is called a
positive-definite quadratic form, and, by reducing to canonical form, we have seen that this
property depends only on the eigenvalues of the real symmetric matrix $A$. This leads us
to classify quadratic forms $V = x^TAx$, where $x = [x_1 \;\; x_2 \;\; \dots \;\; x_n]^T$, in the following
manner.

(a) $V$ is positive-definite, that is $V > 0$ for all vectors $x$ except $x = 0$, if and only
if all the eigenvalues of $A$ are positive.

(b) $V$ is positive-semidefinite, that is $V \geq 0$ for all vectors $x$ and $V = 0$ for at least
one vector $x \neq 0$, if and only if all the eigenvalues of $A$ are non-negative and
at least one of the eigenvalues is zero.

(c) $V$ is negative-definite if $-V$ is positive-definite, with a corresponding condition
on the eigenvalues of $-A$.

(d) $V$ is negative-semidefinite if $-V$ is positive-semidefinite, with a corresponding
condition on the eigenvalues of $-A$.

(e) $V$ is indefinite, that is $V$ takes at least one positive value and at least one
negative value, if and only if the matrix $A$ has both positive and negative
eigenvalues.

Since the classification of a real quadratic form $x^TAx$ depends entirely on the location
of the eigenvalues of the symmetric matrix $A$, it may be viewed as a property of $A$ itself.
For this reason, it is common to talk of positive-definite, positive-semidefinite, and so
on, symmetric matrices without reference to the underlying quadratic form.


Example 1.21 Classify the following quadratic forms:

(a) $3x_1^2 + 2x_2^2 + 3x_3^2 - 2x_1x_2 - 2x_2x_3$

(b) $7x_1^2 + x_2^2 + x_3^2 - 4x_1x_2 - 4x_1x_3 + 8x_2x_3$

(c) $-3x_1^2 - 5x_2^2 - 3x_3^2 + 2x_1x_2 + 2x_2x_3 - 2x_1x_3$

(d) $4x_1^2 + x_2^2 + 15x_3^2 - 4x_1x_2$

Solution (a) The matrix corresponding to the quadratic form is

\[ A = \begin{bmatrix} 3 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 3 \end{bmatrix} \]

The eigenvalues of $A$ are 4, 3 and 1, so the quadratic form is positive-definite.

(b) The matrix corresponding to the quadratic form is

\[ A = \begin{bmatrix} 7 & -2 & -2 \\ -2 & 1 & 4 \\ -2 & 4 & 1 \end{bmatrix} \]

The eigenvalues of $A$ are 9, 3 and -3, so the quadratic form is indefinite.

(c) The matrix corresponding to the quadratic form is

\[ A = \begin{bmatrix} -3 & 1 & -1 \\ 1 & -5 & 1 \\ -1 & 1 & -3 \end{bmatrix} \]

The eigenvalues of $A$ are -6, -3 and -2, so the quadratic form is negative-definite.

(d) The matrix corresponding to the quadratic form is

\[ A = \begin{bmatrix} 4 & -2 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 15 \end{bmatrix} \]

The eigenvalues of $A$ are 15, 5 and 0, so the quadratic form is positive-semidefinite.


In Example 1.21 classifying the quadratic forms involved determining the eigen- 
values of A. If A contains one or more parameters then the task becomes difficult, if not 
impossible, even with the use of a symbolic algebra computer package. Frequently in 
engineering, particularly in stability analysis, it is necessary to determine the range of 
values of a parameter k, say, for which a quadratic form remains definite or at least 
semidefinite in sign. J. J. Sylvester determined criteria for the classification of quadratic 
forms (or the associated real symmetric matrix) that do not require the computation of 
the eigenvalues. These criteria are known as Sylvester’s conditions, which we shall 
briefly discuss without proof. 

In order to classify the quadratic form $x^TAx$ Sylvester's conditions involve
consideration of the principal minors of $A$. A principal minor $P_i$ of order $i$ ($i = 1, 2, \dots, n$) of
an $n \times n$ square matrix $A$ is the determinant of the submatrix, of order $i$, whose principal
diagonal is part of the principal diagonal of $A$. Note that when $i = n$ the principal minor
is $\det A$. In particular, the leading principal minors of $A$ are

\[ D_1 = |a_{11}|, \quad D_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad D_3 = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \quad \dots, \quad D_n = \det A \]


Example 1.22 Determine all the principal minors of the matrix

\[ A = \begin{bmatrix} 1 & k & 0 \\ k & 2 & 0 \\ 0 & 0 & 5 \end{bmatrix} \]

and indicate which are the leading principal minors.

Solution (a) The principal minor of order three is

\[ P_3 = \det A = 5(2 - k^2) \quad \text{(leading principal minor } D_3\text{)} \]

(b) The principal minors of order two are

(i) deleting row 1 and column 1,

\[ P_{21} = \begin{vmatrix} 2 & 0 \\ 0 & 5 \end{vmatrix} = 10 \]

(ii) deleting row 2 and column 2,

\[ P_{22} = \begin{vmatrix} 1 & 0 \\ 0 & 5 \end{vmatrix} = 5 \]

(iii) deleting row 3 and column 3,

\[ P_{23} = \begin{vmatrix} 1 & k \\ k & 2 \end{vmatrix} = 2 - k^2 \quad \text{(leading principal minor } D_2\text{)} \]

(c) The principal minors of order one are

(i) deleting rows 1 and 2 and columns 1 and 2,

\[ P_{11} = |5| = 5 \]

(ii) deleting rows 1 and 3 and columns 1 and 3,

\[ P_{12} = |2| = 2 \]

(iii) deleting rows 2 and 3 and columns 2 and 3,

\[ P_{13} = |1| = 1 \quad \text{(leading principal minor } D_1\text{)} \]


Sylvester's conditions: These state that the quadratic form $x^TAx$, where $A$ is an
$n \times n$ real symmetric matrix, is

(a) positive-definite if and only if all the leading principal minors of $A$ are
positive; that is, $D_i > 0$ ($i = 1, 2, \dots, n$);

(b) negative-definite if and only if the leading principal minors of $A$ alternate in
sign with $a_{11} < 0$; that is, $(-1)^iD_i > 0$ ($i = 1, 2, \dots, n$);

(c) positive-semidefinite if and only if $\det A = 0$ and all the principal minors of
$A$ are non-negative; that is, $\det A = 0$ and $P_i \geq 0$ for all principal minors;

(d) negative-semidefinite if and only if $\det A = 0$ and $(-1)^iP_i \geq 0$ for all principal
minors.


Example 1.23 For what values of $k$ is the matrix $A$ of Example 1.22 positive-definite?

Solution The leading principal minors of $A$ are

\[ D_1 = 1, \quad D_2 = 2 - k^2, \quad D_3 = 5(2 - k^2) \]

These will be positive provided that $2 - k^2 > 0$, so the matrix will be positive-definite
provided that $k^2 < 2$, that is $-\sqrt{2} < k < \sqrt{2}$.


Example 1.24 Using Sylvester's conditions, confirm the conclusions of Example 1.21.

Solution (a) The matrix of the quadratic form is

\[ A = \begin{bmatrix} 3 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 3 \end{bmatrix} \]

and its leading principal minors are

\[ 3, \quad \begin{vmatrix} 3 & -1 \\ -1 & 2 \end{vmatrix} = 5, \quad \det A = 12 \]

Thus, by Sylvester's condition (a), the quadratic form is positive-definite.

(b) The matrix of the quadratic form is

\[ A = \begin{bmatrix} 7 & -2 & -2 \\ -2 & 1 & 4 \\ -2 & 4 & 1 \end{bmatrix} \]

and its leading principal minors are

\[ 7, \quad \begin{vmatrix} 7 & -2 \\ -2 & 1 \end{vmatrix} = 3, \quad \det A = -81 \]

Thus none of Sylvester's conditions can be satisfied, and the quadratic form is
indefinite.

(c) The matrix of the quadratic form is

\[ A = \begin{bmatrix} -3 & 1 & -1 \\ 1 & -5 & 1 \\ -1 & 1 & -3 \end{bmatrix} \]

and its leading principal minors are

\[ -3, \quad \begin{vmatrix} -3 & 1 \\ 1 & -5 \end{vmatrix} = 14, \quad \det A = -36 \]

Thus, by Sylvester's condition (b), the quadratic form is negative-definite.

(d) The matrix of the quadratic form is

\[ A = \begin{bmatrix} 4 & -2 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 15 \end{bmatrix} \]

and its leading principal minors are

\[ 4, \quad \begin{vmatrix} 4 & -2 \\ -2 & 1 \end{vmatrix} = 0, \quad \det A = 0 \]

We therefore need to evaluate all the principal minors to see if the quadratic form
is positive-semidefinite. The principal minors are

\[ 4, \quad 1, \quad 15, \quad \begin{vmatrix} 4 & -2 \\ -2 & 1 \end{vmatrix} = 0, \quad \begin{vmatrix} 1 & 0 \\ 0 & 15 \end{vmatrix} = 15, \quad \begin{vmatrix} 4 & 0 \\ 0 & 15 \end{vmatrix} = 60, \quad \det A = 0 \]

Thus, by Sylvester's condition (c), the quadratic form is positive-semidefinite.
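
The tests applied by hand in Example 1.24 can be sketched in Python with exact integer arithmetic. The function names are illustrative, and the cofactor-expansion determinant is only sensible for the small matrices considered here. Note that a zero matrix satisfies both semidefinite conditions; the sketch reports the first one it meets.

```python
from itertools import combinations

def det(A):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def leading_minors(A):
    """D_1, ..., D_n: determinants of the top-left i x i blocks."""
    return [det([row[:i] for row in A[:i]]) for i in range(1, len(A) + 1)]

def principal_minors(A):
    """All principal minors as (order, value) pairs."""
    n = len(A)
    return [(k, det([[A[i][j] for j in idx] for i in idx]))
            for k in range(1, n + 1) for idx in combinations(range(n), k)]

def classify(A):
    """Classify a real symmetric matrix using Sylvester's conditions (a)-(d)."""
    D = leading_minors(A)
    if all(d > 0 for d in D):
        return "positive-definite"
    if all((-1) ** i * d > 0 for i, d in enumerate(D, start=1)):
        return "negative-definite"
    if D[-1] == 0:
        P = principal_minors(A)
        if all(p >= 0 for _, p in P):
            return "positive-semidefinite"
        if all((-1) ** k * p >= 0 for k, p in P):
            return "negative-semidefinite"
    return "indefinite"

forms = {
    "a": [[3, -1, 0], [-1, 2, -1], [0, -1, 3]],
    "b": [[7, -2, -2], [-2, 1, 4], [-2, 4, 1]],
    "c": [[-3, 1, -1], [1, -5, 1], [-1, 1, -3]],
    "d": [[4, -2, 0], [-2, 1, 0], [0, 0, 15]],
}
for label, A in forms.items():
    print(label, leading_minors(A), classify(A))
```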


1.6.5 Exercises


28 Reduce the quadratic form

\[ 2x_1^2 + 5x_2^2 + 2x_3^2 + 4x_2x_3 + 2x_3x_1 + 4x_1x_2 \]

to the sum of squares by an orthogonal transformation.

29 Classify the quadratic forms

(a) $x_1^2 + 2x_2^2 + 7x_3^2 - 2x_1x_2 + 4x_1x_3 - 2x_2x_3$

(b) $x_1^2 + 2x_2^2 + 5x_3^2 - 2x_1x_2 + 4x_1x_3 - 2x_2x_3$

(c) $x_1^2 + 2x_2^2 + 4x_3^2 - 2x_1x_2 + 4x_1x_3 - 2x_2x_3$

30 (a) Show that $ax_1^2 - 2bx_1x_2 + cx_2^2$ is positive-definite
if and only if $a > 0$ and $ac > b^2$.

(b) Find inequalities that must be satisfied by $a$ and
$b$ to ensure that $2x_1^2 + ax_2^2 + 3x_3^2 - 2x_1x_2 + 2bx_2x_3$
is positive-definite.

31 Evaluate the definiteness of the matrix

\[ A = \begin{bmatrix} 2 & 1 & -1 \\ 1 & 2 & 1 \\ -1 & 1 & 2 \end{bmatrix} \]

(a) by obtaining the eigenvalues;
(b) by evaluating the principal minors.

32 Determine the exact range of $k$ for which the quadratic form

\[ Q(x, y, z) = k(x^2 + y^2) + 2xy + z^2 + 2xz - 2yz \]

is positive-definite in $x$, $y$ and $z$. What can be said
about the definiteness of $Q$ when $k = 2$?

33 Determine the minimum value of the constant
$a$ such that the quadratic form

\[ x^T \begin{bmatrix} 3 + a & 1 & 1 \\ 1 & a & 2 \\ 1 & 2 & a \end{bmatrix} x \]

where $x = [x_1 \;\; x_2 \;\; x_3]^T$, is positive-definite.

34 Express the quadratic form

\[ Q = x_1^2 + 4x_1x_2 - 4x_1x_3 - 6x_2x_3 + \lambda(x_2^2 + x_3^2) \]

in the form $x^TAx$, where $x = [x_1 \;\; x_2 \;\; x_3]^T$ and
$A$ is a symmetric matrix. Hence determine
the range of values of $\lambda$ for which $Q$ is
positive-definite.
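Functions of a matrix


Let $A$ be an $n \times n$ constant square matrix, so that

\[ A^2 = AA, \quad A^3 = AA^2 = A^2A, \quad \text{and so on} \]

are all defined. We can then define a function $f(A)$ of the matrix $A$ using a power series
representation. For example,

\[ f(A) = \sum_{r=0}^{p} \beta_rA^r = \beta_0I + \beta_1A + \dots + \beta_pA^p \qquad (1.26) \]

where we have interpreted $A^0$ as the $n \times n$ identity matrix $I$.


Example 1.25 Given the $2 \times 2$ square matrix

\[ A = \begin{bmatrix} 1 & -1 \\ 2 & 3 \end{bmatrix} \]

determine $f(A) = \sum_{r=0}^{2} \beta_rA^r$ when $\beta_0 = 1$, $\beta_1 = -1$ and $\beta_2 = 3$.

Solution Now

\[ f(A) = \beta_0I + \beta_1A + \beta_2A^2 = 1\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - 1\begin{bmatrix} 1 & -1 \\ 2 & 3 \end{bmatrix} + 3\begin{bmatrix} -1 & -4 \\ 8 & 7 \end{bmatrix} = \begin{bmatrix} -3 & -11 \\ 22 & 19 \end{bmatrix} \]

Note that $A$ is a $2 \times 2$ matrix and $f(A)$ is another $2 \times 2$ matrix.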
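
A finite series of the form (1.26) can be evaluated with Horner's rule, which needs only one matrix product per coefficient. The sketch below assumes the matrix of Example 1.25 has rows 1, -1 and 2, 3; the printed display is damaged in this copy, and these entries are the ones that reproduce the bottom row 22, 19 of the quoted answer. The function names are illustrative.

```python
def matmul(X, Y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matrix_polynomial(betas, A):
    """Evaluate beta_0 I + beta_1 A + ... + beta_p A^p by Horner's rule."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for beta in reversed(betas):
        result = matmul(result, A)      # multiply the running sum by A
        for i in range(n):
            result[i][i] += beta        # then add beta * I
    return result

A = [[1, -1], [2, 3]]                   # assumed entries, see the note above
print(matrix_polynomial([1, -1, 3], A))
```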


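Suppose that in (1.26) we let $p \to \infty$, so that

\[ f(A) = \sum_{r=0}^{\infty} \beta_rA^r \]

We can attach meaning to $f(A)$ in this case if the matrices

\[ f_p(A) = \sum_{r=0}^{p} \beta_rA^r \]

tend to a constant $n \times n$ matrix in the limit as $p \to \infty$.


Example 1.26 For the matrix

\[ A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

using a computer and larger and larger values of $p$, we infer that

\[ f(A) = \lim_{p \to \infty} \sum_{r=0}^{p} \frac{A^r}{r!} \approx \begin{bmatrix} 2.71828 & 0 \\ 0 & 2.71828 \end{bmatrix} \]

indicating that

\[ f(A) = \begin{bmatrix} e & 0 \\ 0 & e \end{bmatrix} \]

What would be the corresponding results if

\[ \text{(a)} \;\; A = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad \text{(b)} \;\; A = \begin{bmatrix} -t & 0 \\ 0 & t \end{bmatrix}? \]

Solution (a) The computer will lead to the prediction

\[ f(A) \approx \begin{bmatrix} (2.71828)^{-1} & 0 \\ 0 & 2.71828 \end{bmatrix} \]

indicating that

\[ f(A) = \begin{bmatrix} e^{-1} & 0 \\ 0 & e \end{bmatrix} \]

(b) The computer is of little help in this case. However, hand calculation shows that
we are generating the matrix

\[ f(A) = \begin{bmatrix} 1 - t + \tfrac{1}{2}t^2 - \tfrac{1}{6}t^3 + \dots & 0 \\ 0 & 1 + t + \tfrac{1}{2}t^2 + \tfrac{1}{6}t^3 + \dots \end{bmatrix} \]

indicating that

\[ f(A) = \begin{bmatrix} e^{-t} & 0 \\ 0 & e^{t} \end{bmatrix} \]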

By analogy with the definition of the scalar exponential function

\[ e^{at} = 1 + at + \frac{a^2t^2}{2!} + \dots + \frac{a^rt^r}{r!} + \dots = \sum_{r=0}^{\infty} \frac{(at)^r}{r!} \]

it is natural to define the matrix function $e^{At}$, where $t$ is a scalar parameter, by the power
series

\[ f(A) = \sum_{r=0}^{\infty} \frac{A^r}{r!}t^r \qquad (1.27) \]

In fact the matrix in part (b) of Example 1.26 illustrates that this definition is reasonable.

In Example 1.26 we were able to spot the construction of the matrix $f(A)$, but this
will not be the case when $A$ is a general $n \times n$ square matrix. In order to overcome this
limitation and generate a method that will not rely on our ability to 'spot' a closed form
of the limiting matrix, we make use of the Cayley-Hamilton theorem, which may be
stated as follows.


Theorem 1.3 Cayley-Hamilton theorem

A square matrix $A$ satisfies its own characteristic equation; that is, if

\[ \lambda^n + c_{n-1}\lambda^{n-1} + c_{n-2}\lambda^{n-2} + \dots + c_1\lambda + c_0 = 0 \]

is the characteristic equation of an $n \times n$ matrix $A$ then

\[ A^n + c_{n-1}A^{n-1} + c_{n-2}A^{n-2} + \dots + c_1A + c_0I = 0 \qquad (1.28) \]

where $I$ is the $n \times n$ identity matrix.


The proof of this theorem is not trivial, and is not included here. We shall illustrate the
theorem using a simple example.


Example 1.27 Verify the Cayley-Hamilton theorem for the matrix

\[ A = \begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix} \]

Solution The characteristic equation of $A$ is

\[ \begin{vmatrix} 3 - \lambda & 4 \\ 1 & 2 - \lambda \end{vmatrix} = 0 \quad \text{or} \quad \lambda^2 - 5\lambda + 2 = 0 \]

Since

\[ A^2 = \begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 13 & 20 \\ 5 & 8 \end{bmatrix} \]

we have

\[ A^2 - 5A + 2I = \begin{bmatrix} 13 & 20 \\ 5 & 8 \end{bmatrix} - 5\begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix} + 2\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = 0 \]

thus verifying the validity of the Cayley-Hamilton theorem for this matrix.
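
For a $2 \times 2$ matrix the characteristic equation is $\lambda^2 - (\operatorname{trace} A)\lambda + \det A = 0$, so the theorem can be checked directly. The sketch below is illustrative only; the function name is not from the text.

```python
def cayley_hamilton_residual(A):
    """Return A^2 - (trace A) A + (det A) I for a 2 x 2 matrix A."""
    (a, b), (c, d) = A
    trace, determinant = a + d, a * d - b * c
    A2 = [[a * a + b * c, a * b + b * d],
          [c * a + d * c, c * b + d * d]]
    identity = [[1, 0], [0, 1]]
    return [[A2[i][j] - trace * A[i][j] + determinant * identity[i][j]
             for j in range(2)] for i in range(2)]

A = [[3, 4], [1, 2]]   # matrix of Example 1.27, with trace 5 and determinant 2
print(cayley_hamilton_residual(A))
```

The residual is the zero matrix for any $2 \times 2$ input, not only for the matrix of Example 1.27.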


In the particular case when $A$ is a $2 \times 2$ matrix with characteristic equation

\[ c(\lambda) = \lambda^2 + a_1\lambda + a_2 = 0 \qquad (1.29) \]

it follows from the Cayley-Hamilton theorem that

\[ c(A) = A^2 + a_1A + a_2I = 0 \]

The significance of this result for our present purposes begins to appear when we
rearrange to give

\[ A^2 = -a_1A - a_2I \]

This means that $A^2$ can be written in terms of $A$ and $A^0 = I$. Moreover, multiplying by
$A$ gives

\[ A^3 = -a_1A^2 - a_2A = -a_1(-a_1A - a_2I) - a_2A = (a_1^2 - a_2)A + a_1a_2I \]

Thus $A^3$ can also be expressed in terms of $A$ and $A^0 = I$; that is, in terms of powers of
$A$ less than $n = 2$, the order of the matrix $A$ in this case. It is clear that we could continue
the process of multiplying by $A$ and substituting $A^2$ for as long as we could manage the
algebra. However, we can quickly convince ourselves that for any integer $r \geq n$

\[ A^r = \alpha_0I + \alpha_1A \qquad (1.30) \]

where $\alpha_0$ and $\alpha_1$ are constants whose values will depend on $r$.

This is a key result deduced from the Cayley-Hamilton theorem, and the determination
of the $\alpha_i$ ($i = 0, 1$) is not as difficult as it might appear. To see how to perform the
calculations, we use the characteristic equation of $A$ itself. If we assume that the
eigenvalues $\lambda_1$ and $\lambda_2$ of $A$ are distinct then it follows from (1.29) that

\[ c(\lambda_i) = \lambda_i^2 + a_1\lambda_i + a_2 = 0 \qquad (i = 1, 2) \]

Thus we can write

\[ \lambda_i^2 = -a_1\lambda_i - a_2 \]

in which $a_1$ and $a_2$ are the same constants as in (1.29). Then, for $i = 1, 2$,

\[ \lambda_i^3 = -a_1\lambda_i^2 - a_2\lambda_i = -a_1(-a_1\lambda_i - a_2) - a_2\lambda_i = (a_1^2 - a_2)\lambda_i + a_1a_2 \]

Proceeding in this way, we deduce that for each of the eigenvalues $\lambda_1$ and $\lambda_2$ we
can write

\[ \lambda_i^r = \alpha_0 + \alpha_1\lambda_i \]

with the same $\alpha_0$ and $\alpha_1$ as in (1.30). This therefore provides us with a procedure for
the calculation of $A^r$ when $r \geq n$ (the order of the matrix) is an integer.


Example 1.28 Given that the matrix

\[ A = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix} \]

has eigenvalues $\lambda_1 = -1$ and $\lambda_2 = -2$ calculate $A^5$ and $A^r$, where $r$ is an integer greater
than 2.

Solution Since $A$ is a $2 \times 2$ square matrix, it follows from (1.30) that

\[ A^5 = \alpha_0I + \alpha_1A \]

and for each eigenvalue $\lambda_i$ ($i = 1, 2$) $\alpha_0$ and $\alpha_1$ satisfy

\[ \lambda_i^5 = \alpha_0 + \alpha_1\lambda_i \]

Substituting $\lambda_1 = -1$ and $\lambda_2 = -2$ leads to the following pair of simultaneous equations:

\[ (-1)^5 = \alpha_0 + \alpha_1(-1), \qquad (-2)^5 = \alpha_0 + \alpha_1(-2) \]

which can be solved for $\alpha_0$ and $\alpha_1$ to give

\[ \alpha_0 = 2(-1)^5 - (-2)^5, \qquad \alpha_1 = (-1)^5 - (-2)^5 \]

Then

\[ A^5 = [2(-1)^5 - (-2)^5]\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + [(-1)^5 - (-2)^5]\begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}
= \begin{bmatrix} 2(-1)^5 - (-2)^5 & (-1)^5 - (-2)^5 \\ -2((-1)^5 - (-2)^5) & 2(-2)^5 - (-1)^5 \end{bmatrix} = \begin{bmatrix} 30 & 31 \\ -62 & -63 \end{bmatrix} \]

Replacing the exponent 5 by the general value $r$, the algebra is identical, and it is easy
to see that

\[ A^r = \begin{bmatrix} 2(-1)^r - (-2)^r & (-1)^r - (-2)^r \\ -2((-1)^r - (-2)^r) & 2(-2)^r - (-1)^r \end{bmatrix} \]
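
The procedure of Example 1.28 can be sketched for any $2 \times 2$ matrix with distinct eigenvalues: solving $\lambda_i^r = \alpha_0 + \alpha_1\lambda_i$ for $i = 1, 2$ gives $\alpha_1 = (\lambda_1^r - \lambda_2^r)/(\lambda_1 - \lambda_2)$ and $\alpha_0 = \lambda_1^r - \alpha_1\lambda_1$. The function names below are illustrative, and the result is compared with repeated multiplication.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two 2 x 2 matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power_distinct(A, lam1, lam2, r):
    """A^r = alpha0 I + alpha1 A for a 2 x 2 matrix with distinct eigenvalues."""
    alpha1 = Fraction(lam1 ** r - lam2 ** r, lam1 - lam2)
    alpha0 = lam1 ** r - alpha1 * lam1
    return [[(alpha0 if i == j else 0) + alpha1 * A[i][j] for j in range(2)]
            for i in range(2)]

def power_direct(A, r):
    """A^r by repeated multiplication, used only as a cross-check."""
    P = [[1, 0], [0, 1]]
    for _ in range(r):
        P = matmul(P, A)
    return P

A = [[0, 1], [-2, -3]]
A5 = power_distinct(A, -1, -2, 5)
print(A5 == power_direct(A, 5))
```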


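To evaluate $\alpha_0$ and $\alpha_1$ in (1.30), we assumed that the matrix $A$ had distinct
eigenvalues $\lambda_1$ and $\lambda_2$, leading to a pair of simultaneous equations for $\alpha_0$ and $\alpha_1$. What
happens if the $2 \times 2$ matrix $A$ has a repeated eigenvalue so that $\lambda_1 = \lambda_2 = \lambda$, say?
We shall apparently have just a single equation to determine the two constants $\alpha_0$ and
$\alpha_1$. However, we can obtain a second equation by differentiating with respect to
$\lambda$, as illustrated in Example 1.29.


Example 1.29 Given that the matrix

\[ A = \begin{bmatrix} 0 & 1 \\ -1 & -2 \end{bmatrix} \]

has eigenvalues $\lambda_1 = \lambda_2 = -1$, determine $A^r$, where $r$ is an integer greater than 2.

Solution Since $A$ is a $2 \times 2$ matrix, it follows from (1.30) that

\[ A^r = \alpha_0I + \alpha_1A \]

with $\alpha_0$ and $\alpha_1$ satisfying

\[ \lambda^r = \alpha_0 + \alpha_1\lambda \qquad (1.31) \]

Since in this case we have only one value of $\lambda$, namely $\lambda = -1$, we differentiate (1.31)
with respect to $\lambda$, to obtain

\[ r\lambda^{r-1} = \alpha_1 \qquad (1.32) \]

Substituting $\lambda = -1$ in (1.31) and (1.32) leads to

\[ \alpha_1 = r(-1)^{r-1}, \qquad \alpha_0 = (-1)^r + \alpha_1 = (1 - r)(-1)^r \]

giving

\[ A^r = (1 - r)(-1)^r\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + r(-1)^{r-1}\begin{bmatrix} 0 & 1 \\ -1 & -2 \end{bmatrix}
= \begin{bmatrix} (1 - r)(-1)^r & -r(-1)^r \\ r(-1)^r & (1 + r)(-1)^r \end{bmatrix} \]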
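
The repeated-eigenvalue case can be sketched in the same way, using $\alpha_1 = r\lambda^{r-1}$ from (1.32) and $\alpha_0 = \lambda^r - \alpha_1\lambda$ from (1.31). The matrix entries below are an assumption: the printed display is damaged in this copy, and rows 0, 1 and -1, -2 are the ones consistent with a repeated eigenvalue of -1 and the legible bottom row. The function names are illustrative.

```python
def matmul(X, Y):
    """Multiply two 2 x 2 matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power_repeated(A, lam, r):
    """A^r = alpha0 I + alpha1 A for a 2 x 2 matrix with a repeated eigenvalue lam."""
    alpha1 = r * lam ** (r - 1)          # from differentiating lam^r = alpha0 + alpha1 lam
    alpha0 = lam ** r - alpha1 * lam
    return [[(alpha0 if i == j else 0) + alpha1 * A[i][j] for j in range(2)]
            for i in range(2)]

def power_direct(A, r):
    """A^r by repeated multiplication, used only as a cross-check."""
    P = [[1, 0], [0, 1]]
    for _ in range(r):
        P = matmul(P, A)
    return P

A = [[0, 1], [-1, -2]]                   # assumed entries, see the note above
print(power_repeated(A, -1, 3), power_direct(A, 3))
```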


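Having found a straightforward way of expressing any positive integer power of the
$2 \times 2$ square matrix $A$ we see that the same process could be used for each of the terms
in (1.26) for $r \geq 2$. Thus, for a $2 \times 2$ matrix $A$ and some $\alpha_0$ and $\alpha_1$,

\[ f(A) = \sum_{r=0}^{p} \beta_rA^r = \alpha_0I + \alpha_1A \]

If, as $p \to \infty$,

\[ f(A) = \lim_{p \to \infty} \sum_{r=0}^{p} \beta_rA^r \]

exists, that is, it is a $2 \times 2$ matrix with finite entries independent of $p$, then we may write

\[ f(A) = \sum_{r=0}^{\infty} \beta_rA^r = \alpha_0I + \alpha_1A \qquad (1.33) \]

We are now in a position to check the results of our computer experiment with the matrix

\[ A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

of Example 1.26. We have defined

\[ f(A) = e^{At} = \sum_{r=0}^{\infty} \frac{A^r}{r!}t^r \]

so we can write

\[ e^{At} = \alpha_0I + \alpha_1A \]

Since $A$ has repeated eigenvalue $\lambda = 1$, we adopt the method of Example 1.29 to give

\[ e^t = \alpha_0 + \alpha_1, \qquad te^t = \alpha_1 \]

leading to

\[ \alpha_1 = te^t, \qquad \alpha_0 = (1 - t)e^t \]

Thus

\[ e^{At} = (1 - t)e^tI + te^tA = e^tI = \begin{bmatrix} e^t & 0 \\ 0 & e^t \end{bmatrix} \]

Setting $t = 1$ confirms our inference in Example 1.26.


Example 1.30 Calculate $e^{At}$ and $\sin At$ when

\[ A = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} \]

Solution Again $A$ has repeated eigenvalues, with $\lambda_1 = \lambda_2 = 1$. Thus for $e^{At}$ we have

\[ e^{At} = \alpha_0I + \alpha_1A \]

with

\[ e^t = \alpha_0 + \alpha_1, \qquad te^t = \alpha_1 \]

leading to

\[ e^{At} = \begin{bmatrix} e^t & -te^t \\ 0 & e^t \end{bmatrix} \]

Similarly,

\[ \sin At = \alpha_0I + \alpha_1A \]

with

\[ \sin t = \alpha_0 + \alpha_1, \qquad t\cos t = \alpha_1 \]

leading to

\[ \sin At = \begin{bmatrix} \sin t & -t\cos t \\ 0 & \sin t \end{bmatrix} \]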
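
The closed forms of Example 1.30 can be compared with truncated power series. The sketch below assumes the matrix with rows 1, -1 and 0, 1 (the entries implied by the quoted results), an arbitrary value t = 0.5 and 30 series terms; the function names are illustrative.

```python
import math

def series(A, t, coeff, terms=30):
    """Sum coeff(r) * (A t)^r / r! for r = 0, ..., terms - 1."""
    n = len(A)
    At = [[A[i][j] * t for j in range(n)] for i in range(n)]
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # (At)^0 / 0!
    total = [[0.0] * n for _ in range(n)]
    for r in range(terms):
        c = coeff(r)
        total = [[total[i][j] + c * term[i][j] for j in range(n)] for i in range(n)]
        # advance to (At)^(r+1) / (r+1)!
        term = [[sum(term[i][k] * At[k][j] for k in range(n)) / (r + 1)
                 for j in range(n)] for i in range(n)]
    return total

def exp_coeff(r):
    """Every term appears in the exponential series."""
    return 1.0

def sin_coeff(r):
    """Only odd powers appear in the sine series, with alternating signs."""
    return 0.0 if r % 2 == 0 else (-1.0) ** ((r - 1) // 2)

A = [[1, -1], [0, 1]]
t = 0.5
exp_At = series(A, t, exp_coeff)
sin_At = series(A, t, sin_coeff)
print(exp_At)
print(sin_At)
```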


Although we have worked so far with $2 \times 2$ matrices, nothing in our development
restricts us to this case. The Cayley-Hamilton theorem allows us to express positive
integer powers of any $n \times n$ square matrix $A$ in terms of powers of $A$ up to $n - 1$. That
is, if $A$ is an $n \times n$ matrix and $p \ge n$ then

$$A^p = \sum_{r=0}^{n-1} \beta_r A^r = \beta_0 I + \beta_1 A + \ldots + \beta_{n-1} A^{n-1}$$

From this we can deduce that for an $n \times n$ matrix $A$ we may write

$$f(A) = \sum_{r=0}^{\infty} \beta_r A^r$$

as

$$f(A) = \sum_{r=0}^{n-1} \alpha_r A^r \qquad (1.34a)$$

which generalizes the result (1.33). Again the coefficients $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ are
obtained by solving the $n$ equations

$$f(\lambda_i) = \sum_{r=0}^{n-1} \alpha_r \lambda_i^r \quad (i = 1, 2, \ldots, n) \qquad (1.34b)$$

where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $A$. If $A$ has repeated eigenvalues, we
differentiate as before, noting that if $\lambda_i$ is an eigenvalue of multiplicity $m$ then the
first $m - 1$ derivatives

$$\frac{d^k}{d\lambda_i^k} f(\lambda_i) = \frac{d^k}{d\lambda_i^k} \sum_{r=0}^{n-1} \alpha_r \lambda_i^r \quad (k = 1, 2, \ldots, m - 1)$$

are also satisfied by $\lambda_i$.





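Sometimes it is advantageous to use an alternative approach to evaluate

$$f(A) = \sum_{r=0}^{p} \beta_r A^r$$

If $A$ possesses $n$ linearly independent eigenvectors then there exists a modal matrix $M$
and spectral matrix $\Lambda$ such that

$$M^{-1} A M = \Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$$

Now

$$M^{-1} f(A) M = \sum_{r=0}^{p} \beta_r (M^{-1} A^r M) = \sum_{r=0}^{p} \beta_r (M^{-1} A M)^r$$

$$= \sum_{r=0}^{p} \beta_r \Lambda^r = \sum_{r=0}^{p} \beta_r \, \mathrm{diag}(\lambda_1^r, \lambda_2^r, \ldots, \lambda_n^r)$$

$$= \mathrm{diag}\left( \sum_{r=0}^{p} \beta_r \lambda_1^r, \sum_{r=0}^{p} \beta_r \lambda_2^r, \ldots, \sum_{r=0}^{p} \beta_r \lambda_n^r \right)$$

$$= \mathrm{diag}(f(\lambda_1), f(\lambda_2), \ldots, f(\lambda_n))$$

This gives us a second method of computing functions of a square matrix, since we see that

$$f(A) = M \, \mathrm{diag}(f(\lambda_1), f(\lambda_2), \ldots, f(\lambda_n)) \, M^{-1} \qquad (1.35)$$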


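Example 1.31  Using the result (1.35), calculate $A^k$ for the matrix

$$A = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}$$

of Example 1.28.

Solution  $A$ has eigenvalues $\lambda_1 = -1$ and $\lambda_2 = -2$ with corresponding eigenvectors

$$e_1 = [1 \;\; -1]^T, \quad e_2 = [1 \;\; -2]^T$$

Thus a modal matrix $M$ and corresponding spectral matrix $\Lambda$ are

$$M = \begin{bmatrix} 1 & 1 \\ -1 & -2 \end{bmatrix}, \quad \Lambda = \begin{bmatrix} -1 & 0 \\ 0 & -2 \end{bmatrix}$$

with

$$M^{-1} = \begin{bmatrix} 2 & 1 \\ -1 & -1 \end{bmatrix}$$

Taking $f(A) = A^k$, we have

$$\mathrm{diag}(f(-1), f(-2)) = \mathrm{diag}((-1)^k, (-2)^k)$$

Thus, from (1.35),

$$f(A) = M \begin{bmatrix} (-1)^k & 0 \\ 0 & (-2)^k \end{bmatrix} M^{-1} = \begin{bmatrix} 2(-1)^k - (-2)^k & (-1)^k - (-2)^k \\ 2((-2)^k - (-1)^k) & 2(-2)^k - (-1)^k \end{bmatrix}$$

as determined in Example 1.28.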
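The result of Example 1.31 can be compared with repeated multiplication. This is a small sketch, assuming the matrix $A = M \Lambda M^{-1}$ implied by the eigenvalues and eigenvectors quoted in the solution; the function names are illustrative. Integer arithmetic is used so the comparison is exact.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

A = [[0, 1], [-2, -3]]          # matrix implied by the eigen-data of Example 1.31
M = [[1, 1], [-1, -2]]          # modal matrix (eigenvectors as columns)
M_inv = [[2, 1], [-1, -1]]      # inverse of M

def power_via_modal(k):
    """A^k = M diag((-1)^k, (-2)^k) M^{-1}, from (1.35) with f(A) = A^k."""
    D = [[(-1) ** k, 0], [0, (-2) ** k]]
    return mat_mul(mat_mul(M, D), M_inv)

def power_direct(k):
    """A^k by k successive multiplications."""
    P = [[1, 0], [0, 1]]
    for _ in range(k):
        P = mat_mul(P, A)
    return P

for k in range(6):
    print(k, power_via_modal(k), power_direct(k))
```

Both columns of output should coincide for every $k$, in line with the closed form obtained in the example.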


Example 1.31 demonstrates a second approach to the calculation of a function of a 
matrix. There is little difference in the labour associated with each method, so perhaps 
the only comment we should make is that each approach gives a different perspective 
on the construction of the matrix function either from powers of the matrix itself or 
from its spectral and modal matrices. 

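Later in this chapter we need to make use of some properties of the exponential
matrix $e^{At}$, where $A$ is a constant $n \times n$ square matrix. These are now briefly discussed.

(i) Considering the power series definition given in (1.27)

$$e^{At} = I + At + \frac{1}{2!} A^2 t^2 + \frac{1}{3!} A^3 t^3 + \ldots$$

term-by-term differentiation gives

$$\frac{d}{dt} e^{At} = A + \frac{2}{2!} A^2 t + \frac{3}{3!} A^3 t^2 + \ldots = A \left[ I + At + \frac{1}{2!} A^2 t^2 + \ldots \right]$$

so that

$$\frac{d}{dt} (e^{At}) = A e^{At} = e^{At} A \qquad (1.36)$$

(ii) Likewise, term-by-term integration of the power series gives

$$\int_0^t e^{A\tau} \, d\tau = \int_0^t I \, d\tau + \int_0^t A\tau \, d\tau + \int_0^t \frac{1}{2!} A^2 \tau^2 \, d\tau + \ldots = It + \frac{1}{2!} A t^2 + \frac{1}{3!} A^2 t^3 + \ldots$$

so that

$$A \int_0^t e^{A\tau} \, d\tau + I = e^{At}$$

giving

$$\int_0^t e^{A\tau} \, d\tau = A^{-1} [e^{At} - I] = [e^{At} - I] A^{-1} \qquad (1.37)$$

provided the inverse exists.

(iii) $$e^{A(t_1 + t_2)} = e^{At_1} e^{At_2} \qquad (1.38)$$

Although this property is true in general we shall illustrate its validity for the
particular case when $A$ has $n$ linearly independent eigenvectors. Then, from (1.35),

$$e^{At_1} = M \, \mathrm{diag}(e^{\lambda_1 t_1}, e^{\lambda_2 t_1}, \ldots, e^{\lambda_n t_1}) \, M^{-1}$$

$$e^{At_2} = M \, \mathrm{diag}(e^{\lambda_1 t_2}, e^{\lambda_2 t_2}, \ldots, e^{\lambda_n t_2}) \, M^{-1}$$

so that

$$e^{At_1} e^{At_2} = M \, \mathrm{diag}(e^{\lambda_1 (t_1 + t_2)}, e^{\lambda_2 (t_1 + t_2)}, \ldots, e^{\lambda_n (t_1 + t_2)}) \, M^{-1} = e^{A(t_1 + t_2)}$$

(iv) It is important to note that in general

$$e^{At} e^{Bt} \ne e^{(A + B)t}$$

It follows from the power series definition that

$$e^{At} e^{Bt} = e^{(A + B)t} \qquad (1.39)$$

if and only if the matrices $A$ and $B$ commute; that is, if $AB = BA$.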
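Property (iv) is easy to see numerically. The sketch below uses a truncated power series for the matrix exponential (an illustrative approximation, adequate only for small matrices of modest norm); the test matrices and helper names are assumptions chosen for the demonstration, with $t = 1$.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def expm_series(A, terms=40):
    """Truncated series I + A + A^2/2! + ... for e^{A} (t = 1)."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]          # holds A^r / r!
    for r in range(1, terms):
        term = [[x / r for x in row] for row in mat_mul(term, A)]
        total = mat_add(total, term)
    return total

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Non-commuting pair: AB != BA
A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
gap_noncommuting = max_abs_diff(mat_mul(expm_series(A), expm_series(B)),
                                expm_series(mat_add(A, B)))

# Commuting pair: D = 2C, so CD = DC
C = [[1, 1], [0, 1]]
D = [[2, 2], [0, 2]]
gap_commuting = max_abs_diff(mat_mul(expm_series(C), expm_series(D)),
                             expm_series(mat_add(C, D)))

print(gap_noncommuting, gap_commuting)
```

For the non-commuting pair the two sides differ by an amount of order one, whereas for the commuting pair the difference is at rounding level.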
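To conclude this section we consider the derivative and integral of a matrix $A(t) = [a_{ij}(t)]$, whose elements $a_{ij}(t)$ are functions of $t$. The derivative and integral of $A(t)$ are
defined respectively by

$$\frac{d}{dt} A(t) = \left[ \frac{d}{dt} a_{ij}(t) \right] \qquad (1.40a)$$

and

$$\int A(t) \, dt = \left[ \int a_{ij}(t) \, dt \right] \qquad (1.40b)$$

that is, each element of the matrix is differentiated or integrated as appropriate.

Example 1.32  Evaluate $dA/dt$ and $\int A \, dt$ for the matrix

$$A = \begin{bmatrix} t^2 + 1 & t - 3 \\ 2 & t^2 + 2t - 1 \end{bmatrix}$$

Solution  Using (1.40a),

$$\frac{dA}{dt} = \begin{bmatrix} \frac{d}{dt}(t^2 + 1) & \frac{d}{dt}(t - 3) \\ \frac{d}{dt}(2) & \frac{d}{dt}(t^2 + 2t - 1) \end{bmatrix} = \begin{bmatrix} 2t & 1 \\ 0 & 2t + 2 \end{bmatrix}$$

Using (1.40b),

$$\int A \, dt = \begin{bmatrix} \int (t^2 + 1) \, dt & \int (t - 3) \, dt \\ \int 2 \, dt & \int (t^2 + 2t - 1) \, dt \end{bmatrix} = \begin{bmatrix} \frac{1}{3} t^3 + t + c_{11} & \frac{1}{2} t^2 - 3t + c_{12} \\ 2t + c_{21} & \frac{1}{3} t^3 + t^2 - t + c_{22} \end{bmatrix}$$

$$= \begin{bmatrix} \frac{1}{3} t^3 + t & \frac{1}{2} t^2 - 3t \\ 2t & \frac{1}{3} t^3 + t^2 - t \end{bmatrix} + \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix} = \begin{bmatrix} \frac{1}{3} t^3 + t & \frac{1}{2} t^2 - 3t \\ 2t & \frac{1}{3} t^3 + t^2 - t \end{bmatrix} + C$$

where $C$ is a constant matrix.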


Using the Symbolic Math Toolbox in MATLAB the derivative and integral of the
matrix A(t) are generated using the commands diff(A) and int(A) respectively.
To illustrate this confirm that the derivative of the matrix A(t) of Example 1.32 is
generated using the sequence of commands
syms t
A=[t^2+1 t-3; 2 t^2+2*t-1];
df=diff(A);
pretty(df)

and its integral by the additional commands
I=int(A);
pretty(I)

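From the basic definitions, it follows that for constants $\alpha$ and $\beta$

$$\frac{d}{dt} (\alpha A + \beta B) = \alpha \frac{dA}{dt} + \beta \frac{dB}{dt} \qquad (1.41)$$

$$\int (\alpha A + \beta B) \, dt = \alpha \int A \, dt + \beta \int B \, dt \qquad (1.42)$$

$$\frac{d}{dt} (AB) = A \frac{dB}{dt} + \frac{dA}{dt} B \qquad (1.43)$$

Note in (1.43) that order is important, since in general $AB \ne BA$.

Note that in general

$$\frac{d}{dt} [A(t)]^n \ne n A^{n-1} \frac{dA}{dt}$$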


1.7.1 Exercises

Check your answers using MATLAB or MAPLE whenever possible.

35  Show that the matrix

$$A = \begin{bmatrix} 5 & 6 \\ 2 & 3 \end{bmatrix}$$

satisfies its own characteristic equation.


36  Given


use the Cayley—Hamilton theorem to evaluate 

ӘА (DA (А 

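37  The characteristic equation of an $n \times n$ matrix $A$ is

$$\lambda^n + c_{n-1} \lambda^{n-1} + c_{n-2} \lambda^{n-2} + \ldots + c_1 \lambda + c_0 = 0$$

so, by the Cayley-Hamilton theorem,

$$A^n + c_{n-1} A^{n-1} + c_{n-2} A^{n-2} + \ldots + c_1 A + c_0 I = 0$$

If $A$ is non-singular then every eigenvalue is non-zero, so $c_0 \ne 0$ and

$$I = -\frac{1}{c_0} (A^n + c_{n-1} A^{n-1} + \ldots + c_1 A)$$

which on multiplying throughout by $A^{-1}$ gives

$$A^{-1} = -\frac{1}{c_0} (A^{n-1} + c_{n-1} A^{n-2} + \ldots + c_1 I) \qquad (1.44)$$

(a) Using (1.44) find the inverse of the matrix


(b) Show that the characteristic equation of the
matrix


is

$$\lambda^3 - 3\lambda^2 - 7\lambda - 11 = 0$$

Evaluate $A^2$ and, using (1.44), determine $A^{-1}$.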


38  Given


A= 


_ WwW N 
N Re Ut 


1 
2 
3 


compute $A^2$ and, using the Cayley-Hamilton
theorem, compute

$$A^7 - 3A^6 + A^4 + 3A^3 - 2A^2 + 3I$$


39  Evaluate $e^{At}$ for

(a) $A = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$,  (b) $A = \begin{bmatrix} 1 & 0 \\ 1 & 2 \end{bmatrix}$





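40  Given

$$A = \frac{\pi}{2} \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$$

show that

$$\sin A = \frac{4}{\pi} A - \frac{4}{\pi^2} A^2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$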
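41  Given

$$A = \begin{bmatrix} t^2 + 1 & 2t - 3 \\ 5 - t & t^2 - t + 3 \end{bmatrix}$$

evaluate

(a) $\dfrac{dA}{dt}$  (b) $\displaystyle\int_1^2 A \, dt$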


42  Given

$$A = \begin{bmatrix} t^2 + 1 & t - 1 \\ 5 & 0 \end{bmatrix}$$

evaluate $A^2$ and show that

$$\frac{d}{dt} (A^2) \ne 2A \frac{dA}{dt}$$

1.8 Singular value decomposition

So far we have been concerned mainly with square matrices, dealing in particular with
the inverse matrix, the eigenvalue problem and reduction to canonical form. In this
section we consider analogous results for non-square (or rectangular) matrices, all of
which have important applications in engineering.

First we review some definitions associated with non-square matrices: 


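(a) A non-square $m \times n$ matrix

$$A = (a_{ij}), \quad i = 1, 2, \ldots, m; \; j = 1, 2, \ldots, n$$

is said to be diagonal if all the $i, j$ entries are zero except possibly for $i = j$. For
example:

$$\begin{bmatrix} 2 & 0 \\ 0 & 3 \\ 0 & 0 \end{bmatrix} \text{ is a diagonal } 3 \times 2 \text{ matrix}$$

whilst

$$\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \end{bmatrix} \text{ is a diagonal } 2 \times 3 \text{ matrix}$$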


(b) The row rank of a m x n matrix A denotes the maximum number of linearly 
independent rows of A, whilst the column rank of A denotes the maximum 
number of linearly independent columns of A. It turns out that these are the same 
and referred to simply as the rank of the matrix A and denoted by r = rank(A). It 
follows that r is less than, or equal to, the minimum of m and n. The matrix A is 
said to be of full-rank if r equals the minimum of m and n. 


Example 1.33  For the $3 \times 4$ matrix

$$A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 7 & 10 \\ 2 & 1 & 3 & 5 \end{bmatrix}$$

confirm that row rank(A) = column rank(A).

Solution  Following the process outlined in Section 1.2.6 we reduce the matrix to row (column)
echelon form using row (column) elementary operations.

(a) Row rank: using elementary row operations

$$\begin{bmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 7 & 10 \\ 2 & 1 & 3 & 5 \end{bmatrix}$$

row 2 − 3 × row 1, row 3 − 2 × row 1

$$\begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & -2 & -2 & -2 \\ 0 & -3 & -3 & -3 \end{bmatrix}$$

multiply row 2 by $-\frac{1}{2}$

$$\begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 1 & 1 \\ 0 & -3 & -3 & -3 \end{bmatrix}$$

row 3 + 3 × row 2

$$\begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

which is in row echelon form and indicating that

row rank(A) = 2

(b) Column rank: using elementary column operations

$$\begin{bmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 7 & 10 \\ 2 & 1 & 3 & 5 \end{bmatrix}$$

col 2 − 2 × col 1, col 3 − 3 × col 1, col 4 − 4 × col 1

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 3 & -2 & -2 & -2 \\ 2 & -3 & -3 & -3 \end{bmatrix}$$

col 3 − col 2, col 4 − col 2

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 3 & -2 & 0 & 0 \\ 2 & -3 & 0 & 0 \end{bmatrix}$$

which is in column echelon form and indicating that

column rank(A) = 2

confirming that

rank(A) = row rank(A) = column rank(A) = 2
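The elimination carried out in Example 1.33 can be automated. The following sketch, with an illustrative function name `rank`, reduces a matrix to row echelon form using exact rational arithmetic and counts the pivots; applying it to the transpose gives the column rank.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots found when reducing to row echelon form."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(M), len(M[0])
    r = 0                                   # index of the next pivot row
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, n_rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == n_rows:
            break
    return r

A = [[1, 2, 3, 4], [3, 4, 7, 10], [2, 1, 3, 5]]
A_T = [list(col) for col in zip(*A)]        # rows of A_T are the columns of A

row_rank = rank(A)
column_rank = rank(A_T)
print(row_rank, column_rank)
```

Both values equal 2, which is less than min(3, 4) = 3, so the matrix is not of full rank.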


Note that in this case the matrix $A$ is not of full rank.

1.8.1 Singular values

For an $m \times n$ matrix $A$ the transposed matrix $A^T$ has dimension $n \times m$ so that the product
$AA^T$ is a square matrix of dimension $m \times m$. This product is also a symmetric matrix
since

$$(AA^T)^T = (A^T)^T (A^T) = AA^T$$

It follows from Section 1.4.7 that the $m \times m$ matrix $AA^T$ has a full set of $m$ linearly
independent eigenvectors $u_1, u_2, \ldots, u_m$ that are mutually orthogonal, and which
can be normalized to give the orthogonal normalized set (or orthonormal set) of
eigenvectors

$$\hat{u}_1, \hat{u}_2, \ldots, \hat{u}_m$$

with $\hat{u}_i^T \hat{u}_j = \delta_{ij}$ $(i, j = 1, 2, \ldots, m)$, where $\delta_{ij}$ is the Kronecker delta defined in
Section 1.3.2.

(Reminder: As indicated in Section 1.4.2 normalized eigenvectors are uniquely
determined up to a scale factor of ±1.) We then define the $m \times m$ orthogonal matrix $\hat{U}$
as a matrix having this normalized set of eigenvectors as its columns:

$$\hat{U} = [\hat{u}_1 \;\; \hat{u}_2 \;\; \ldots \;\; \hat{u}_m] \qquad (1.45)$$

with $\hat{U}^T \hat{U} = \hat{U} \hat{U}^T = I_m$. Such a matrix is also called a unitary matrix.
Let $\lambda_1, \lambda_2, \ldots, \lambda_m$ be the corresponding eigenvalues of $AA^T$ so that

$$(AA^T) \hat{u}_i = \lambda_i \hat{u}_i, \quad i = 1, 2, \ldots, m$$

Considering the square of the length, or norm, of the vector $A^T \hat{u}_i$ then from orthogonality

$$|A^T \hat{u}_i|^2 = (A^T \hat{u}_i)^T (A^T \hat{u}_i) = \hat{u}_i^T (AA^T \hat{u}_i) = \hat{u}_i^T \lambda_i \hat{u}_i = \lambda_i$$

(Note: the notation $\|A^T \hat{u}_i\|$ is also frequently used.) Since $|A^T \hat{u}_i|^2 \ge 0$ it follows that the
eigenvalues $\lambda_i$ $(i = 1, 2, \ldots, m)$ of the matrix $AA^T$ are all non-negative and so can be
written in the form

$$\lambda_i = \sigma_i^2, \quad i = 1, 2, \ldots, m$$

It is also assumed that they are arranged in a non-increasing order so that

$$\sigma_1^2 \ge \sigma_2^2 \ge \ldots \ge \sigma_m^2 \ge 0$$

Some of these eigenvalues may be zero. The number of non-zero values (accounting
for multiplicity) is equal to $r$, the rank of the matrix $A$. Thus, if rank(A) = $r$ then the
matrix $AA^T$ has eigenvalues

$$\sigma_1^2 \ge \sigma_2^2 \ge \ldots \ge \sigma_r^2 > 0 \quad \text{with} \quad \sigma_{r+1}^2 = \ldots = \sigma_m^2 = 0$$

The positive square roots of the non-zero eigenvalues of the matrix $AA^T$ are called the
singular values of the matrix $A$ and play a similar role in general matrix theory to that which
eigenvalues play in the theory of square matrices. If the matrix $A$ has rank $r$ then it has
$r$ singular values

$$\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_r > 0$$


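In practice determining the singular values of a non-square matrix provides a means of
determining the rank of the matrix.

Example 1.34  For the matrix

$$A = \begin{bmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{bmatrix}$$

(a) Determine the eigenvalues and corresponding eigenvectors of the matrix $AA^T$.
(b) Normalize the eigenvectors to obtain the corresponding orthogonal matrix $\hat{U}$ and
confirm that $\hat{U} \hat{U}^T = I$.
(c) What are the singular values of $A$?
(d) What is the rank of $A$?

Solution  (a)

$$AA^T = \begin{bmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 & 1 \\ -1 & 3 & 1 \end{bmatrix} = \begin{bmatrix} 10 & 0 & 2 \\ 0 & 10 & 4 \\ 2 & 4 & 2 \end{bmatrix}$$

(Note that $AA^T$ is a symmetric matrix.)
The eigenvalues of $AA^T$ are given by the solutions of the equation

$$|AA^T - \lambda I| = \begin{vmatrix} 10 - \lambda & 0 & 2 \\ 0 & 10 - \lambda & 4 \\ 2 & 4 & 2 - \lambda \end{vmatrix} = 0$$

which reduces to

$$(12 - \lambda)(10 - \lambda)\lambda = 0$$

giving the eigenvalues as

$$\lambda_1 = 12, \quad \lambda_2 = 10, \quad \lambda_3 = 0$$

Solving the homogeneous equations

$$(AA^T - \lambda_i I) u_i = 0$$

gives the corresponding eigenvectors as:

$$u_1 = [1 \;\; 2 \;\; 1]^T, \quad u_2 = [2 \;\; -1 \;\; 0]^T, \quad u_3 = [1 \;\; 2 \;\; -5]^T$$

(b) The corresponding normalized eigenvectors are:

$$\hat{u}_1 = \left[ \tfrac{1}{\sqrt{6}} \;\; \tfrac{2}{\sqrt{6}} \;\; \tfrac{1}{\sqrt{6}} \right]^T, \quad \hat{u}_2 = \left[ \tfrac{2}{\sqrt{5}} \;\; -\tfrac{1}{\sqrt{5}} \;\; 0 \right]^T, \quad \hat{u}_3 = \left[ \tfrac{1}{\sqrt{30}} \;\; \tfrac{2}{\sqrt{30}} \;\; -\tfrac{5}{\sqrt{30}} \right]^T$$

giving the corresponding orthogonal matrix

$$\hat{U} = [\hat{u}_1 \;\; \hat{u}_2 \;\; \hat{u}_3] = \begin{bmatrix} \frac{1}{\sqrt{6}} & \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{30}} \\ \frac{2}{\sqrt{6}} & -\frac{1}{\sqrt{5}} & \frac{2}{\sqrt{30}} \\ \frac{1}{\sqrt{6}} & 0 & -\frac{5}{\sqrt{30}} \end{bmatrix} = \begin{bmatrix} 0.4082 & 0.8944 & 0.1826 \\ 0.8165 & -0.4472 & 0.3651 \\ 0.4082 & 0.0000 & -0.9129 \end{bmatrix}$$

By direct multiplication

$$\hat{U} \hat{U}^T = \begin{bmatrix} \frac{1}{\sqrt{6}} & \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{30}} \\ \frac{2}{\sqrt{6}} & -\frac{1}{\sqrt{5}} & \frac{2}{\sqrt{30}} \\ \frac{1}{\sqrt{6}} & 0 & -\frac{5}{\sqrt{30}} \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{6}} & \frac{2}{\sqrt{6}} & \frac{1}{\sqrt{6}} \\ \frac{2}{\sqrt{5}} & -\frac{1}{\sqrt{5}} & 0 \\ \frac{1}{\sqrt{30}} & \frac{2}{\sqrt{30}} & -\frac{5}{\sqrt{30}} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

confirming that $\hat{U} \hat{U}^T = I$.

(c) The singular values of $A$ are the square roots of the non-zero eigenvalues of $AA^T$.
Thus the singular values of $A$ are $\sigma_1 = \sqrt{12}$ and $\sigma_2 = \sqrt{10}$.

(d) The rank of $A$ is equal to the number of singular values giving rank(A) = 2. This
can be confirmed by reducing $A$ to echelon form.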


Likewise, for an $m \times n$ matrix $A$ the product $A^T A$ is a square $n \times n$ symmetric matrix,
having a full set of $n$ orthogonal normalized eigenvectors $\hat{v}_1, \hat{v}_2, \ldots, \hat{v}_n$ which form the
columns of the $n \times n$ orthogonal matrix $\hat{V}$:

$$\hat{V} = [\hat{v}_1 \;\; \hat{v}_2 \;\; \ldots \;\; \hat{v}_n] \qquad (1.46)$$

and having corresponding non-negative eigenvalues $\mu_1, \mu_2, \ldots, \mu_n$ with

$$\mu_1 \ge \mu_2 \ge \ldots \ge \mu_n \ge 0 \qquad (1.47)$$

Again the number of non-zero eigenvalues equals $r$, the rank of $A$, so that the product
$A^T A$ has eigenvalues

$$\mu_1 \ge \mu_2 \ge \ldots \ge \mu_r > 0 \quad \text{with} \quad \mu_{r+1} = \ldots = \mu_n = 0$$

Thus

$$A^T A \hat{v}_i = \mu_i \hat{v}_i, \quad \mu_i > 0 \quad (i = 1, 2, \ldots, r) \qquad (1.48)$$

Premultiplying by $A$ gives

$$(AA^T)(A \hat{v}_i) = \mu_i (A \hat{v}_i)$$

so that $\mu_i$ and $(A \hat{v}_i)$ are an eigenvalue and eigenvector pair of the matrix $AA^T$; indicating
that the non-zero eigenvalues of the product $AA^T$ are the same as the non-zero eigenvalues
of the product $A^T A$. Thus if $A$ is of rank $r$ then the eigenvalues (1.47) of the
product $A^T A$ may be written as

$$\mu_i = \begin{cases} \sigma_i^2, & i = 1, 2, \ldots, r \\ 0, & i = r + 1, \ldots, n \end{cases}$$

In general the vector $(A \hat{v}_i)$ is not a unit vector so

$$A \hat{v}_i = k \hat{u}_i \qquad (1.49)$$

and we need to show that $k = \sigma_i$. Taking the norm of $(A \hat{v}_i)$ gives

$$|A \hat{v}_i|^2 = (A \hat{v}_i)^T (A \hat{v}_i) = \hat{v}_i^T A^T A \hat{v}_i = \hat{v}_i^T \mu_i \hat{v}_i \quad \text{from (1.48)}$$

$$= \mu_i = \sigma_i^2$$

giving

$$|A \hat{v}_i| = k = \sigma_i$$

It follows from (1.49) that

$$A \hat{v}_i = \begin{cases} \sigma_i \hat{u}_i, & i = 1, 2, \ldots, r \\ 0, & i = r + 1, \ldots, n \end{cases} \qquad (1.50)$$

Clearly the singular values of $A$ may be determined by evaluating the eigenvalues of
the product $AA^T$ or the product $A^T A$. The eigenvectors $\hat{u}_1, \hat{u}_2, \ldots, \hat{u}_m$ of the product
$AA^T$ (that is the columns of $\hat{U}$) are called the left singular vectors of $A$ and the eigenvectors
$\hat{v}_1, \hat{v}_2, \ldots, \hat{v}_n$ of the product $A^T A$ (that is columns of $\hat{V}$) are called the right
singular vectors of $A$.


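Example 1.35  For the matrix

$$A = \begin{bmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{bmatrix}$$

(a) Determine the eigenvalues and corresponding eigenvectors of the product $A^T A$.
(b) Normalize the eigenvectors to obtain the orthogonal matrix $\hat{V}$.
(c) What are the singular values of $A$?

Solution  (a)

$$A^T A = \begin{bmatrix} 3 & 1 & 1 \\ -1 & 3 & 1 \end{bmatrix} \begin{bmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 11 & 1 \\ 1 & 11 \end{bmatrix}$$

The eigenvalues of $A^T A$ are given by the solutions of the equation

$$|A^T A - \mu I| = \begin{vmatrix} 11 - \mu & 1 \\ 1 & 11 - \mu \end{vmatrix} = 0$$

which reduces to

$$(\mu - 12)(\mu - 10) = 0$$

giving the eigenvalues as

$$\mu_1 = 12, \quad \mu_2 = 10$$

Solving the homogeneous equations

$$(A^T A - \mu_i I) v_i = 0$$

gives the corresponding eigenvectors as

$$v_1 = [1 \;\; 1]^T, \quad v_2 = [1 \;\; -1]^T$$

(b) The corresponding normalized eigenvectors are:

$$\hat{v}_1 = \left[ \tfrac{1}{\sqrt{2}} \;\; \tfrac{1}{\sqrt{2}} \right]^T, \quad \hat{v}_2 = \left[ \tfrac{1}{\sqrt{2}} \;\; -\tfrac{1}{\sqrt{2}} \right]^T$$

giving the orthogonal matrix

$$\hat{V} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} 0.7071 & 0.7071 \\ 0.7071 & -0.7071 \end{bmatrix}$$

(c) The singular values of $A$ are the square roots of the non-zero eigenvalues of $A^T A$.
Thus the singular values of $A$ are:

$$\sigma_1 = \sqrt{\mu_1} = \sqrt{12} = 3.4641 \quad \text{and} \quad \sigma_2 = \sqrt{\mu_2} = \sqrt{10} = 3.1623$$

in agreement with the values obtained in Example 1.34.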
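The calculation of Examples 1.34 and 1.35 can be reproduced in a few lines. This is a minimal sketch for a matrix with two columns only, so that the eigenvalues of the $2 \times 2$ product $A^T A$ follow from the quadratic formula; the helper names and the tolerance `1e-12` used to decide which eigenvalues count as non-zero are illustrative assumptions.

```python
import math

def transpose(X):
    return [list(col) for col in zip(*X)]

def mat_mul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

A = [[3, -1], [1, 3], [1, 1]]
AtA = mat_mul(transpose(A), A)              # 2 x 2 symmetric product A^T A

# Characteristic equation mu^2 - trace*mu + det = 0
trace = AtA[0][0] + AtA[1][1]
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
disc = math.sqrt(trace ** 2 - 4 * det)
mu = [(trace + disc) / 2, (trace - disc) / 2]   # non-increasing order

# Singular values: positive square roots of the non-zero eigenvalues
sigma = [math.sqrt(m) for m in mu if m > 1e-12]
rank_A = len(sigma)
print(AtA, mu, sigma, rank_A)
```

The output gives $\mu_1 = 12$, $\mu_2 = 10$, singular values 3.4641 and 3.1623 to four decimal places, and rank 2.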


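1.8.2 Singular value decomposition (SVD)

For an $m \times n$ matrix $A$ of rank $r$ the $n$ equations (1.50) can be written in the partitioned
form

$$A [\hat{v}_1 \; \ldots \; \hat{v}_r \mid \hat{v}_{r+1} \; \ldots \; \hat{v}_n] = [\hat{u}_1 \; \ldots \; \hat{u}_r \mid \hat{u}_{r+1} \; \ldots \; \hat{u}_m] \left[ \begin{array}{ccc|c} \sigma_1 & & 0 & \\ & \ddots & & 0 \\ 0 & & \sigma_r & \\ \hline & 0 & & 0 \end{array} \right] \qquad (1.51)$$

where $\sigma_1, \sigma_2, \ldots, \sigma_r$ are the singular values of $A$. More precisely (1.51) may be
written as

$$A \hat{V} = \hat{U} \Sigma$$

where $\Sigma$ denotes the $m \times n$ partitioned matrix on the right-hand side of (1.51). Using the orthogonality property $\hat{V} \hat{V}^T = I$ leads to the result

$$A = \hat{U} \Sigma \hat{V}^T \qquad (1.52)$$

Such a decomposition (or factorization) of a non-square matrix $A$ is called the
singular value decomposition of $A$, commonly abbreviated as SVD of $A$. It is
analogous to the reduction to canonical (or diagonal) form of a square matrix developed
in Section 1.6.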


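Example 1.36  Find the SVD of the matrix

$$A = \begin{bmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{bmatrix}$$

and verify your answer.

Solution  The associated matrices $\hat{U}$ and $\hat{V}$ and the singular values of $A$ were determined in
Examples 1.34 and 1.35 as:

$$\hat{U} = \begin{bmatrix} \frac{1}{\sqrt{6}} & \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{30}} \\ \frac{2}{\sqrt{6}} & -\frac{1}{\sqrt{5}} & \frac{2}{\sqrt{30}} \\ \frac{1}{\sqrt{6}} & 0 & -\frac{5}{\sqrt{30}} \end{bmatrix}, \quad \hat{V} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix}, \quad \sigma_1 = \sqrt{12} \text{ and } \sigma_2 = \sqrt{10}$$

From (1.52) it follows that the SVD of $A$ is

$$A = \begin{bmatrix} \frac{1}{\sqrt{6}} & \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{30}} \\ \frac{2}{\sqrt{6}} & -\frac{1}{\sqrt{5}} & \frac{2}{\sqrt{30}} \\ \frac{1}{\sqrt{6}} & 0 & -\frac{5}{\sqrt{30}} \end{bmatrix} \begin{bmatrix} \sqrt{12} & 0 \\ 0 & \sqrt{10} \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix}$$

Direct multiplication of the right hand side confirms

$$A = \begin{bmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{bmatrix}$$

The decomposition (1.50) can always be done. The non-zero diagonal elements of $\Sigma$
are uniquely determined as the singular values of $A$. The matrices $\hat{U}$ and $\hat{V}$ are not
unique and it is necessary to ensure that linear combinations of their columns satisfy
(1.50). This applies when the matrices have repeated eigenvalues, as illustrated in
Example 1.37.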
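Both the verification in Example 1.36 and the remark on non-uniqueness can be illustrated numerically. In the sketch below (helper names are illustrative) the product $\hat{U} \Sigma \hat{V}^T$ is formed three times: with the matrices of the example, with the sign of the first column changed in both $\hat{U}$ and $\hat{V}$, and with the sign changed in $\hat{U}$ only. Only the last pairing violates the column-by-column requirement (1.50).

```python
import math

def transpose(X):
    return [list(col) for col in zip(*X)]

def mat_mul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def reconstruct(U, Sigma, V):
    """Form the product U * Sigma * V^T."""
    return mat_mul(mat_mul(U, Sigma), transpose(V))

def close(X, Y, tol=1e-12):
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

s2, s5, s6, s30 = (math.sqrt(v) for v in (2, 5, 6, 30))
A = [[3, -1], [1, 3], [1, 1]]
U = [[1 / s6, 2 / s5, 1 / s30],
     [2 / s6, -1 / s5, 2 / s30],
     [1 / s6, 0.0, -5 / s30]]
Sigma = [[math.sqrt(12), 0.0], [0.0, math.sqrt(10)], [0.0, 0.0]]
V = [[1 / s2, 1 / s2], [1 / s2, -1 / s2]]

# Change the sign of the first singular-vector pair
U_flipped = [[-row[0]] + row[1:] for row in U]
V_flipped = [[-row[0]] + row[1:] for row in V]

ok_original = close(reconstruct(U, Sigma, V), A)
ok_both_flipped = close(reconstruct(U_flipped, Sigma, V_flipped), A)
ok_mismatched = close(reconstruct(U_flipped, Sigma, V), A)
print(ok_original, ok_both_flipped, ok_mismatched)
```

The first two flags are true and the third is false: the factors are determined only up to consistent choices of the paired singular vectors.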


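Example 1.37  Find the SVD of the matrix

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{bmatrix}$$

Solution

$$AA^T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

The product $AA^T$ has eigenvalues $\lambda_1 = 4$, $\lambda_2 = 4$, $\lambda_3 = 1$ and $\lambda_4 = 0$. Normalized eigenvectors
corresponding to $\lambda_3$ and $\lambda_4$ are respectively

$$\hat{u}_3 = [1 \;\; 0 \;\; 0 \;\; 0]^T \quad \text{and} \quad \hat{u}_4 = [0 \;\; 0 \;\; 0 \;\; 1]^T$$

Various possibilities exist for the repeated eigenvalues $\lambda_1 = \lambda_2 = 4$. Two possible
choices of normalized eigenvectors are

$$\hat{u}_1 = [0 \;\; 1 \;\; 0 \;\; 0]^T \quad \text{and} \quad \hat{u}_2 = [0 \;\; 0 \;\; 1 \;\; 0]^T$$

or

$$\hat{u}_1' = \tfrac{1}{\sqrt{2}} [0 \;\; 1 \;\; 1 \;\; 0]^T \quad \text{and} \quad \hat{u}_2' = \tfrac{1}{\sqrt{2}} [0 \;\; 1 \;\; -1 \;\; 0]^T$$

(Note that the eigenvectors $\hat{u}_1'$ and $\hat{u}_2'$ are linear combinations of $\hat{u}_1$ and $\hat{u}_2$.) Likewise

$$A^T A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{bmatrix}$$

and has eigenvalues $\mu_1 = 4$, $\mu_2 = 4$ and $\mu_3 = 1$. The normalized eigenvector corresponding
to the eigenvalue $\mu_3 = 1$ is

$$\hat{v}_3 = [1 \;\; 0 \;\; 0]^T$$

and two possible choices for the eigenvectors corresponding to the repeated eigenvalue
$\mu_1 = \mu_2 = 4$ are

$$\hat{v}_1 = [0 \;\; 1 \;\; 0]^T \quad \text{and} \quad \hat{v}_2 = [0 \;\; 0 \;\; 1]^T$$

or

$$\hat{v}_1' = \tfrac{1}{\sqrt{2}} [0 \;\; 1 \;\; 1]^T \quad \text{and} \quad \hat{v}_2' = \tfrac{1}{\sqrt{2}} [0 \;\; 1 \;\; -1]^T$$

The singular values of $A$ are $\sigma_1 = 2$, $\sigma_2 = 2$ and $\sigma_3 = 1$ giving

$$\Sigma = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$

Considering the requirements (1.50) it is readily confirmed that

$$A \hat{v}_1 = \sigma_1 \hat{u}_1, \quad A \hat{v}_2 = \sigma_2 \hat{u}_2 \quad \text{and} \quad A \hat{v}_3 = \sigma_3 \hat{u}_3$$

so that

$$\hat{U}_1 = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad \hat{V}_1 = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$

reduces $A$ to the SVD form $A = \hat{U}_1 \Sigma \hat{V}_1^T$.
Also, it can be confirmed that

$$A \hat{v}_1' = \sigma_1 \hat{u}_1', \quad A \hat{v}_2' = \sigma_2 \hat{u}_2', \quad A \hat{v}_3 = \sigma_3 \hat{u}_3$$

so that the matrix pair

$$\hat{U}_2 = \begin{bmatrix} 0 & 0 & 1 & 0 \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0 \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad \hat{V}_2 = \begin{bmatrix} 0 & 0 & 1 \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 \end{bmatrix}$$

reduces $A$ to the SVD form

$$A = \hat{U}_2 \Sigma \hat{V}_2^T$$

However, the corresponding columns of the matrix pair $\hat{U}_2$, $\hat{V}_1$ do not satisfy conditions
(1.50) and

$$A \ne \hat{U}_2 \Sigma \hat{V}_1^T$$

To ensure that conditions (1.50) are satisfied it is advisable to select the normalized
eigenvectors $\hat{v}_i$ first and then determine the corresponding normalized eigenvectors $\hat{u}_i$
directly from (1.50).

1.8.3 Pseudo inverse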


In Section 1.2.5 we considered the solution of the system of simultaneous linear
equations

$$Ax = b \qquad (1.53)$$

where $A$ is the $n \times n$ square matrix of coefficients and $x$ is the $n$ vector of unknowns.
Here the number of equations is equal to the number of unknowns and a unique solution

$$x = A^{-1} b \qquad (1.54)$$

exists if and only if the matrix $A$ is non-singular.
There are situations when the matrix $A$ is singular or a non-square $m \times n$ matrix. If
the matrix $A$ is an $m \times n$ matrix then:

• if $m > n$ there are more equations than unknowns and this represents the over
determined case;

• if $m < n$ there are fewer equations than unknowns and this represents the under
determined case.

Clearly approximate solution vectors $x$ are desirable in such cases. This can be achieved
using the SVD form (1.52) of an $m \times n$ matrix $A$. Recognizing the orthogonality of $\hat{U}$
and $\hat{V}$ the following matrix $A^\dagger$ is defined

$$A^\dagger = \hat{V} \Sigma^* \hat{U}^T \qquad (1.55)$$

where $\Sigma^*$ is the transpose of $\Sigma$ in which the singular values $\sigma_i$ of $A$ are replaced by their
reciprocals. The matrix $A^\dagger$ is called the pseudo inverse (or generalized inverse) of the
matrix $A$. It is also frequently referred to as the Moore-Penrose pseudo inverse of $A$.
It exists for any matrix $A$ including singular square matrices and non-square matrices.
In the particular case when $A$ is a square non-singular matrix $A^\dagger = A^{-1}$. Since

$$A^\dagger A = \left[ \begin{array}{c|c} I & 0 \\ \hline 0 & 0 \end{array} \right]$$

a solution of (1.53) is $A^\dagger A x = A^\dagger b$, that is

$$x = A^\dagger b \qquad (1.56)$$

This is the least squares solution of (1.53) in that it minimizes $(Ax - b)^T (Ax - b)$, the
sum of the squares of the errors.


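Example 1.38  Determine the pseudo inverse of the matrix

$$A = \begin{bmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{bmatrix}$$

and confirm that $A^\dagger A = I$.

Solution  From Example 1.36 the SVD of $A$ is

$$A = \hat{U} \Sigma \hat{V}^T = \begin{bmatrix} \frac{1}{\sqrt{6}} & \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{30}} \\ \frac{2}{\sqrt{6}} & -\frac{1}{\sqrt{5}} & \frac{2}{\sqrt{30}} \\ \frac{1}{\sqrt{6}} & 0 & -\frac{5}{\sqrt{30}} \end{bmatrix} \begin{bmatrix} \sqrt{12} & 0 \\ 0 & \sqrt{10} \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix}$$

The matrix $\Sigma^*$ is obtained by taking the transpose of $\Sigma$ and inverting the non-zero
diagonal elements, giving

$$A^\dagger = \hat{V} \Sigma^* \hat{U}^T = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{12}} & 0 & 0 \\ 0 & \frac{1}{\sqrt{10}} & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{6}} & \frac{2}{\sqrt{6}} & \frac{1}{\sqrt{6}} \\ \frac{2}{\sqrt{5}} & -\frac{1}{\sqrt{5}} & 0 \\ \frac{1}{\sqrt{30}} & \frac{2}{\sqrt{30}} & -\frac{5}{\sqrt{30}} \end{bmatrix} = \frac{1}{60} \begin{bmatrix} 17 & 4 & 5 \\ -7 & 16 & 5 \end{bmatrix}$$

Direct multiplication gives

$$A^\dagger A = \frac{1}{60} \begin{bmatrix} 17 & 4 & 5 \\ -7 & 16 & 5 \end{bmatrix} \begin{bmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{bmatrix} = \frac{1}{60} \begin{bmatrix} 60 & 0 \\ 0 & 60 \end{bmatrix} = I$$

so that $A^\dagger$ is a left inverse of $A$. However, $A^\dagger$ cannot be a right inverse of $A$.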


We noted in the solution to Example 1.38 that whilst $A^\dagger$ was a left inverse of $A$ it was
not a right inverse. Indeed a matrix with more rows than columns cannot have a right
inverse, but it will have a left inverse if such an inverse exists. Likewise, a matrix with
more columns than rows cannot have a left inverse, but will have a right inverse if such
an inverse exists.

There are other ways of computing the pseudo inverse, without having to use SVD.
However, most are more restrictive in use and not so generally applicable as the SVD
method. It has been shown that $A^\dagger$ is a unique pseudo inverse of an $m \times n$ matrix $A$
provided it satisfies the following four conditions:

$$\left. \begin{array}{l} AA^\dagger \text{ is symmetric} \\ A^\dagger A \text{ is symmetric} \\ AA^\dagger A = A \\ A^\dagger A A^\dagger = A^\dagger \end{array} \right\} \qquad (1.57)$$

For example, if an $m \times n$ matrix $A$ is of full rank then the pseudo inverse may be
calculated as follows:

$$\text{if } m \ge n \text{ then } A^\dagger = (A^T A)^{-1} A^T \qquad (1.58a)$$

$$\text{if } m \le n \text{ then } A^\dagger = A^T (AA^T)^{-1} \qquad (1.58b)$$

It is left as an exercise to confirm that these two forms satisfy conditions (1.57).
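For a particular matrix the four conditions (1.57) can be checked mechanically. The sketch below is one way of doing so, using the $3 \times 2$ matrix of Example 1.38 and the full-rank formula (1.58a); exact rational arithmetic avoids rounding questions. The helper names are illustrative, and `inverse_2x2` handles only the $2 \times 2$ case needed here.

```python
from fractions import Fraction

def transpose(X):
    return [list(col) for col in zip(*X)]

def mat_mul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def inverse_2x2(X):
    """Inverse of a non-singular 2 x 2 matrix, returned as Fractions."""
    (a, b), (c, d) = X
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[3, -1], [1, 3], [1, 1]]
At = transpose(A)
A_pinv = mat_mul(inverse_2x2(mat_mul(At, A)), At)   # (1.58a), valid since m >= n

P = mat_mul(A, A_pinv)      # A A^dagger   (3 x 3)
Q = mat_mul(A_pinv, A)      # A^dagger A   (2 x 2)

conditions = [
    P == transpose(P),                  # A A^dagger is symmetric
    Q == transpose(Q),                  # A^dagger A is symmetric
    mat_mul(P, A) == A,                 # A A^dagger A = A
    mat_mul(Q, A_pinv) == A_pinv,       # A^dagger A A^dagger = A^dagger
]
print(A_pinv, conditions)
```

All four entries of `conditions` are true, and `A_pinv` equals $\frac{1}{60}$ times the integer matrix found in Example 1.38. Note that `Q` is the $2 \times 2$ identity whereas `P` is not the $3 \times 3$ identity, in line with the left-inverse remark above.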


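Example 1.39  (a) Without using SVD determine the pseudo inverse of the matrix

$$A = \begin{bmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{bmatrix}$$

(b) Find the least squares solution of the following systems of simultaneous linear
equations

(i) $3x - y = 2$, $x + 3y = 4$, $x + y = 2$
(ii) $3x - y = 2$, $x + 3y = 2$, $x + y = 2$

and comment on the answers.

Solution  (a) From the solution to Example 1.34 rank(A) = 2, so the matrix $A$ is of full rank.
Since in this case $m > n$ we can use (1.58a) to determine the pseudo inverse as

$$A^\dagger = (A^T A)^{-1} A^T = \begin{bmatrix} 11 & 1 \\ 1 & 11 \end{bmatrix}^{-1} \begin{bmatrix} 3 & 1 & 1 \\ -1 & 3 & 1 \end{bmatrix} = \frac{1}{120} \begin{bmatrix} 11 & -1 \\ -1 & 11 \end{bmatrix} \begin{bmatrix} 3 & 1 & 1 \\ -1 & 3 & 1 \end{bmatrix}$$

$$= \frac{1}{60} \begin{bmatrix} 17 & 4 & 5 \\ -7 & 16 & 5 \end{bmatrix} = \begin{bmatrix} 0.2833 & 0.0667 & 0.0833 \\ -0.1167 & 0.2667 & 0.0833 \end{bmatrix}$$

in agreement with the result obtained in Example 1.38.

(b) Both (i) and (ii) are examples of over determined (or over specified) sets of
equations $Ax = b$ with $A$ being an $m \times n$ matrix, $m > n$, $b$ being an $m$-vector and
$x$ an $n$-vector of unknowns. Considering the augmented matrix $(A : b)$ then:

• if rank(A : b) > rank(A) the equations are inconsistent and there is no solution
(this is the most common situation for over specified sets of equations);

• if rank(A : b) = rank(A) some of the equations are redundant and there is a
solution containing $n$ − rank(A) free parameters.

(See Section 5.6 of Modern Engineering Mathematics.)
Considering case (i)

$$A = \begin{bmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{bmatrix}, \quad b = \begin{bmatrix} 2 \\ 4 \\ 2 \end{bmatrix} \quad \text{and} \quad x = \begin{bmatrix} x \\ y \end{bmatrix}$$

$$\text{rank}(A : b) = \text{rank} \begin{bmatrix} 3 & -1 & 2 \\ 1 & 3 & 4 \\ 1 & 1 & 2 \end{bmatrix} = 2 = \text{rank}(A) \text{ from (a).}$$

Thus the equations are consistent and a unique solution exists. The least squares
solution is

$$\begin{bmatrix} x \\ y \end{bmatrix} = \frac{1}{60} \begin{bmatrix} 17 & 4 & 5 \\ -7 & 16 & 5 \end{bmatrix} \begin{bmatrix} 2 \\ 4 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

which gives the unique solution $x = y = 1$.
Considering case (ii) $A$ and $x$ are the same as in (i) and $b = [2 \;\; 2 \;\; 2]^T$

$$\text{rank}(A : b) = \text{rank} \begin{bmatrix} 3 & -1 & 2 \\ 1 & 3 & 2 \\ 1 & 1 & 2 \end{bmatrix} = 3 > \text{rank}(A) = 2$$

Thus the equations are inconsistent and there is no unique solution. The least
squares solution is

$$\begin{bmatrix} x \\ y \end{bmatrix} = \frac{1}{60} \begin{bmatrix} 17 & 4 & 5 \\ -7 & 16 & 5 \end{bmatrix} \begin{bmatrix} 2 \\ 2 \\ 2 \end{bmatrix} = \begin{bmatrix} \frac{13}{15} \\ \frac{7}{15} \end{bmatrix}$$

giving $x = \frac{13}{15}$ and $y = \frac{7}{15}$.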
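The two least squares solutions of Example 1.39 can be reproduced, together with the sum of squared errors each leaves behind. The sketch assumes the pseudo inverse $\frac{1}{60}[17\ 4\ 5;\ -7\ 16\ 5]$ found in the example; the helper names and the small perturbation of 1/100 used for comparison are illustrative.

```python
from fractions import Fraction

A = [[3, -1], [1, 3], [1, 1]]
A_pinv = [[Fraction(17, 60), Fraction(4, 60), Fraction(5, 60)],
          [Fraction(-7, 60), Fraction(16, 60), Fraction(5, 60)]]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def squared_error(x, b):
    """(Ax - b)^T (Ax - b), the sum of the squares of the errors."""
    return sum((ax - bi) ** 2 for ax, bi in zip(mat_vec(A, x), b))

b_i = [2, 4, 2]      # right-hand side of case (i)
b_ii = [2, 2, 2]     # right-hand side of case (ii)

x_i = mat_vec(A_pinv, b_i)
x_ii = mat_vec(A_pinv, b_ii)
err_i = squared_error(x_i, b_i)
err_ii = squared_error(x_ii, b_ii)

# Any nearby point should give a larger squared error in case (ii)
nudged = [x_ii[0] + Fraction(1, 100), x_ii[1]]
err_nudged = squared_error(nudged, b_ii)
print(x_i, err_i, x_ii, err_ii, err_nudged)
```

In case (i) the error is exactly zero, so the least squares solution is the exact solution of the consistent equations. In case (ii) a non-zero error remains, and moving away from $x = \frac{13}{15}$, $y = \frac{7}{15}$ increases it.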


As indicated earlier, the least squares solution $x = A^\dagger b$ of the system of equations $Ax = b$
is the solution that minimizes the square of the error vector $r = (Ax - b)$; that is, minimizes
$(Ax - b)^T (Ax - b)$.

In practice, data associated with individual equations within the set may not be
equally reliable; so more importance may be attached to some of the errors $r_i$. To
accommodate this, a weighting factor (positive number) $w_i$ is given to the $i$th equation
$(i = 1, 2, \ldots, m)$ and the least squares solution is the solution that minimizes the
square of the vector $W(Ax - b)$, where $W$ is the $m \times m$ diagonal matrix having
the square roots $\sqrt{w_i}$ of the weighting factors as its diagonal entries; that is

$$W = \begin{bmatrix} \sqrt{w_1} & 0 & \cdots & 0 \\ 0 & \sqrt{w_2} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{w_m} \end{bmatrix}$$

The larger $w_i$ the closer the fit of the least squares solution to the $i$th equation; the
smaller $w_i$ the poorer the fit. Care over weighting must be taken when using least
squares solution packages. Most times one would notice the heavy weighting, but in
automated systems one probably would not notice. Exercise 49 serves to illustrate.
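The effect of weighting can be seen on the inconsistent system (ii) of Example 1.39. The sketch below minimizes the square of $W(Ax - b)$ by solving the weighted normal equations $A^T W^T W A x = A^T W^T W b$, where $W^T W = \mathrm{diag}(w_1, \ldots, w_m)$. This route through the normal equations is an assumption made for the illustration (it is valid when $A$ is of full rank and has two columns, as here), and the weights (1, 1, 100) are hypothetical values chosen only to emphasize the third equation.

```python
from fractions import Fraction

def weighted_least_squares(A, b, w):
    """Solve (A^T diag(w) A) x = A^T diag(w) b for a matrix A with two columns."""
    n11 = sum(wi * r[0] * r[0] for wi, r in zip(w, A))
    n12 = sum(wi * r[0] * r[1] for wi, r in zip(w, A))
    n22 = sum(wi * r[1] * r[1] for wi, r in zip(w, A))
    g1 = sum(wi * r[0] * bi for wi, r, bi in zip(w, A, b))
    g2 = sum(wi * r[1] * bi for wi, r, bi in zip(w, A, b))
    det = Fraction(n11 * n22 - n12 * n12)
    return [(n22 * g1 - n12 * g2) / det, (n11 * g2 - n12 * g1) / det]

A = [[3, -1], [1, 3], [1, 1]]
b = [2, 2, 2]

x_equal = weighted_least_squares(A, b, [1, 1, 1])      # unweighted case
x_heavy = weighted_least_squares(A, b, [1, 1, 100])    # third equation emphasized

# Residual of the third equation x + y = 2 in each case
res3_equal = abs(x_equal[0] + x_equal[1] - 2)
res3_heavy = abs(x_heavy[0] + x_heavy[1] - 2)
print(x_equal, res3_equal, x_heavy, res3_heavy)
```

With equal weights the solution of Example 1.39(ii) is recovered; with the heavy weight the third equation is satisfied much more closely, at the expense of the other two.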


In MATLAB the command
svd(A)

returns the singular values of A in non-increasing order; whilst the command
[U,S,V]=svd(A)

returns the diagonal matrix S = $\Sigma$ and the two unitary matrices U = $\hat{U}$ and V = $\hat{V}$ such
that A = USV$^T$. The commands:
A=sym(A);
svd(A)

return the singular values of the matrix A in symbolic form. Symbolic singular vectors
are not available. The command:
pinv(A)

returns the pseudo inverse of the matrix A using the SVD form of A.
Using the matrix A of Examples 1.35, 1.36, 1.38 and 1.39 the commands
A=[3 -1;1 3;1 1];
[U,S,V]=svd(A)

return
U = -0.4082   0.8944  -0.1826
    -0.8165  -0.4472  -0.3651
    -0.4082  -0.0000   0.9129

S =  3.4641        0
          0   3.1623
          0        0

V = -0.7071   0.7071
    -0.7071  -0.7071

The additional command
pinv(A)

returns the pseudo inverse of A as
 0.2833   0.0667   0.0833
-0.1167   0.2667   0.0833

The commands:
A=[3 -1;1 3;1 1];
A=sym(A);
S=svd(A)

return the singular values in symbolic form, namely $2\sqrt{3}$ $(= \sqrt{12})$ and $\sqrt{10}$.
In MAPLE the commands
with(LinearAlgebra):
A:=Matrix([[3,-1],[1,3],[1,1]]);
svd:=SingularValues(A,output=['U','S','Vt']);

return
-0.4082   0.8944  -0.1826        3.4641
-0.8165  -0.4472  -0.3651   ,    3.1623   ,   -0.7071  -0.7071
-0.4082  -0.0000   0.9129        0.0000        0.7071  -0.7071

where the singular values are expressed as a vector. To output the values of U and
Vt separately and to output the singular values as a matrix the following additional
commands may be used
U:=svd[1];
Vt:=svd[3];
SS:=matrix(3,2,(i,j) -> if i=j then svd[2][i] else 0 fi); #output the singular values into a 3 x 2 matrix

The further command
U.SS.Vt

gives the output
3.0000  -1.0000
1.0000   3.0000
1.0000   1.0000

confirming that we reproduce A.

To obtain the pseudo inverse using MAPLE the normal matrix inverse command
is used. Thus the commands
with(LinearAlgebra):
A:=Matrix([[3,-1],[1,3],[1,1]]);
MatrixInverse(A);

return
 17/60   1/15   1/12
 -7/60   4/15   1/12

in agreement with the answer obtained in Example 1.38.


1.8.4 Exercises

Use MATLAB or MAPLE to check your answers.

43  Considering the matrix

$$A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 7 & 10 \\ 2 & 1 & 5 & 7 \end{bmatrix}$$

(a) Determine row rank(A) and column rank(A).
(b) Is the matrix $A$ of full rank?

44  (a) Find the SVD form of the matrix

$$A = \begin{bmatrix} 4 & 11 & 14 \\ 8 & 7 & -2 \end{bmatrix}$$

(b) Use SVD to determine the pseudo inverse $A^\dagger$ of the matrix $A$. Confirm that $AA^\dagger = I$.
(c) Determine the pseudo inverse without using SVD.

45  Show that the matrix

$$A = \begin{bmatrix} 1 & 1 \\ 3 & 0 \\ -2 & 1 \\ 0 & 2 \\ -1 & 2 \end{bmatrix}$$

is of full rank. Without using SVD determine its pseudo inverse $A^\dagger$ and confirm that $A^\dagger A = I$.

46  Considering the matrix

$$A = \begin{bmatrix} 1 & -1 \\ -2 & 2 \\ 2 & -2 \end{bmatrix}$$

(a) What is the rank of $A$?
(b) Find the SVD of $A$.
(c) Find the pseudo inverse $A^\dagger$ of $A$ and confirm that $AA^\dagger A = A$ and $A^\dagger A A^\dagger = A^\dagger$.
(d) Find the least squares solution of the simultaneous equations
$x - y = 1$, $-2x + 2y = 2$, $2x - 2y = 3$
(e) Confirm the answer to (d) by minimizing the square of the error vector
$(Ax - b)$ where $b = [1 \;\; 2 \;\; 3]^T$.

47  Considering the matrix

$$A = \begin{bmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{bmatrix}$$

(a) Use the pseudo inverse $A^\dagger$ determined in Example 1.38 to find the least squares solution
for the simultaneous equations
$3x - y = 1$, $x + 3y = 2$, $x + y = 3$
(b) Confirm the answer to (a) by minimizing the square of the error vector
$(Ax - b)$ where $b = [1 \;\; 2 \;\; 3]^T$.
(c) By drawing the straight lines represented by the equations illustrate your answer graphically.

48  Considering the matrix

$$A = \begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & -1 \\ -1 & 1 & 1 \\ 2 & -1 & 2 \end{bmatrix}$$

(a) Show that $A$ is of full rank.
(b) Determine the pseudo inverse $A^\dagger$.
(c) Show that the $A^\dagger$ obtained satisfies the four conditions (1.57).

49  Find the least squares solution of the following pairs of simultaneous linear equations.

(a) (i) $2x + y = 3$, $x + 2y = 3$, $x + y = 2$
    (ii) $2x - y = 3$, $x + 2y = 3$, $x + y = 3$
(b) (i) $2x + y = 3$, $x + 2y = 3$, $10x + 10y = 20$
    (ii) $2x - y = 3$, $x + 2y = 3$, $10x + 10y = 30$
(c) (i) $2x + y = 3$, $x + 2y = 3$, $100x + 100y = 200$
    (ii) $2x - y = 3$, $x + 2y = 3$, $100x + 100y = 300$

Comment on your answers.

50  By representing the data in the matrix form $Az = y$,
where $z = [m \;\; c]^T$, use the pseudo inverse to find the
values of $m$ and $c$ which provide the least squares fit
to the linear model $y = mx + c$ for the following data.

k      1   2   3   4   5
x_k    0   1   3   3   4
y_k    1   1   2   2   3

(Compare with Example 2.17 in Modern Engineering Mathematics.)


1.9 State-space representation

In Section 10.11.2 of Modern Engineering Mathematics it was illustrated how the solution
of differential equation initial value problems of order $n$ can be reduced to the
solution of a set of $n$ first-order differential equations, each with an initial condition. In
this section we shall apply matrix techniques to obtain the solution of such systems.

1.9.1 Single-input-single-output (SISO) systems

First let us consider the single-input-single-output (SISO) system characterized by
the $n$th-order linear differential equation

$$a_n \frac{d^n y}{dt^n} + a_{n-1} \frac{d^{n-1} y}{dt^{n-1}} + \ldots + a_1 \frac{dy}{dt} + a_0 y = u(t) \qquad (1.59)$$

where the coefficients $a_i$ $(i = 0, 1, \ldots, n)$ are constants with $a_n \ne 0$ and it is assumed
that the initial conditions $y(0), y^{(1)}(0), \ldots, y^{(n-1)}(0)$ are known.
We introduce the $n$ variables $x_1(t), x_2(t), \ldots, x_n(t)$ defined by

$$x_1(t) = y(t)$$
$$x_2(t) = \frac{dy}{dt} = \dot{x}_1(t)$$
$$x_3(t) = \frac{d^2 y}{dt^2} = \dot{x}_2(t)$$
$$\vdots$$
$$x_{n-1}(t) = \frac{d^{n-2} y}{dt^{n-2}} = \dot{x}_{n-2}(t)$$
$$x_n(t) = \frac{d^{n-1} y}{dt^{n-1}} = \dot{x}_{n-1}(t)$$

where, as usual, a dot denotes differentiation with respect to time $t$. Then, by substituting
in (1.59), we have

$$a_n \dot{x}_n + a_{n-1} x_n + a_{n-2} x_{n-1} + \ldots + a_1 x_2 + a_0 x_1 = u(t)$$

giving

$$\dot{x}_n = -\frac{a_{n-1}}{a_n} x_n - \frac{a_{n-2}}{a_n} x_{n-1} - \ldots - \frac{a_1}{a_n} x_2 - \frac{a_0}{a_n} x_1 + \frac{1}{a_n} u$$

Thus, we can represent (1.59) as a system of $n$ simultaneous first-order differential
equations

$$\dot{x}_1 = x_2$$
$$\dot{x}_2 = x_3$$
$$\vdots$$
$$\dot{x}_{n-1} = x_n$$
$$\dot{x}_n = -\frac{a_0}{a_n} x_1 - \frac{a_1}{a_n} x_2 - \ldots - \frac{a_{n-1}}{a_n} x_n + \frac{1}{a_n} u$$

which may be written as the vector-matrix differential equation

$$\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \vdots \\ \dot{x}_{n-1} \\ \dot{x}_n \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ -\frac{a_0}{a_n} & -\frac{a_1}{a_n} & -\frac{a_2}{a_n} & \cdots & -\frac{a_{n-2}}{a_n} & -\frac{a_{n-1}}{a_n} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ \frac{1}{a_n} \end{bmatrix} u(t) \qquad (1.60)$$

(Note: Clearly $x_1, x_2, \ldots, x_n$ and $u$ are functions of $t$ and strictly should be written as
$x_1(t), x_2(t), \ldots, x_n(t)$ and $u(t)$. For the sake of convenience and notational simplicity the
argument $(t)$ is frequently omitted when the context is clear.)
Equation (1.60) may be written in the more concise form

$$\dot{x} = Ax + bu \qquad (1.61a)$$

The vector $x(t)$ is called the system state vector, and it contains all the information that
one needs to know about the behaviour of the system. Its components are the $n$ state
variables $x_1, x_2, \ldots, x_n$, which may be considered as representing a set of coordinate
axes in the $n$-dimensional coordinate space over which $x(t)$ ranges. This is referred to
as the state space, and as time increases the state vector $x(t)$ will describe a locus in this
space called a trajectory. In two dimensions the state space reduces to the phase plane.
The matrix A is called the system matrix and the particular form adopted in (1.60) is 
known as the companion form, which is widely adopted in practice. Equation (1.61a) 
is referred to as the system state equation. 
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The output, or response, of the system determined by (1.59) is given by y, which in 
terms of the state variables is determined by x,. Thus 


X1 
X» 
y=[1 0 0]|. 
Xn 
or, more concisely, 
у=с'х (1.610) 
wherec=[1 0 ... OJ’. 


A distinct advantage of the vector—matrix approach is that it is applicable to 
multivariable (that is, multi-input—multi-output MIMO) systems, dealt with in Section 
1.9.2. In such cases it is particularly important to distinguish between the system state 
variables and the system outputs, which, in general, are linear combinations of the 
state variables. 

Together the pair of equations (1.61a,b) in the form 


\[ \dot{\mathbf{x}} = A\mathbf{x} + \mathbf{b}u \tag{1.62a} \]
\[ y = \mathbf{c}^{\mathrm{T}}\mathbf{x} \tag{1.62b} \]


constitute the dynamic equations of the system and are commonly referred to as the 
state-space model representation of the system. Such a representation forms the basis 
of the so-called 'modern approach' to the analysis and design of control systems in
engineering. An obvious advantage of adopting the vector-matrix representation (1.62) 
is the compactness of the notation. 

More generally the output y could be a linear combination of both the state and input, 
so that the more general form of the system dynamic equations (1.62) is 


\[ \dot{\mathbf{x}} = A\mathbf{x} + \mathbf{b}u \tag{1.63a} \]
\[ y = \mathbf{c}^{\mathrm{T}}\mathbf{x} + du \tag{1.63b} \]

Comment

It is important to realize that the choice of state variables $x_1, x_2, \dots, x_n$ is not unique.


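For example, for the system represented by (1.59) we could also take

\[ x_1 = \frac{d^{n-1}y}{dt^{n-1}}, \quad x_2 = \frac{d^{n-2}y}{dt^{n-2}}, \quad \dots, \quad x_n = y \]

leading to the state-space model (1.62) with

\[
A = \begin{bmatrix}
-\dfrac{a_{n-1}}{a_n} & -\dfrac{a_{n-2}}{a_n} & \cdots & -\dfrac{a_1}{a_n} & -\dfrac{a_0}{a_n} \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{bmatrix},
\quad
\mathbf{b} = \begin{bmatrix} \dfrac{1}{a_n} \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix},
\quad
\mathbf{c} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}
\tag{1.64}
\]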


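Example 1.40

Obtain a state-space representation of the system characterized by the third-order
differential equation

\[ \frac{d^3y}{dt^3} + 3\frac{d^2y}{dt^2} + 2\frac{dy}{dt} - 4y = e^{-t} \tag{1.65} \]

Solution

Writing

\[ x_1 = y, \quad x_2 = \dot{x}_1 = \frac{dy}{dt}, \quad x_3 = \dot{x}_2 = \frac{d^2y}{dt^2} \]

we have, from (1.65),

\[ \dot{x}_3 = \frac{d^3y}{dt^3} = 4y - 2\frac{dy}{dt} - 3\frac{d^2y}{dt^2} + e^{-t} = 4x_1 - 2x_2 - 3x_3 + e^{-t} \]

Thus the corresponding state equation is

\[
\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \end{bmatrix}
=
\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 4 & -2 & -3 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
+
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} e^{-t}
\]

with the output y being given by

\[ y = x_1 = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \]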


These two equations then constitute the state-space representation of the system. 
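
The construction used in Example 1.40 can be mechanized. The following Python sketch is
illustrative only: the function name companion_state_space and the convention of listing
the coefficients lowest order first are assumptions made here, not notation from the text.
It builds A, b and c of (1.60) and (1.61b) from the coefficients of (1.59) and is applied
to the equation of Example 1.40.

```python
def companion_state_space(a):
    """Return (A, b, c) for a_n y^(n) + ... + a_1 y' + a_0 y = u(t).

    `a` lists the coefficients [a_0, a_1, ..., a_n], lowest order first
    (an assumed convention for this sketch).
    """
    n = len(a) - 1
    a_n = a[-1]
    if n < 1 or a_n == 0:
        raise ValueError("need an equation of order >= 1 with a_n != 0")
    # Rows 1..n-1 encode x_i' = x_{i+1}: a single 1 on the superdiagonal.
    A = [[1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n - 1)]
    # Last row encodes x_n' = -(a_0/a_n) x_1 - ... - (a_{n-1}/a_n) x_n + u/a_n.
    A.append([-a[j] / a_n for j in range(n)])
    b = [0.0] * (n - 1) + [1.0 / a_n]
    c = [1.0] + [0.0] * (n - 1)      # output y = x_1
    return A, b, c


# Example 1.40: y''' + 3y'' + 2y' - 4y = e^{-t}
A, b, c = companion_state_space([-4, 2, 3, 1])
for row in A:
    print(row)
```

The last row of A printed by the sketch should agree with the row (4, -2, -3) obtained by hand above.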


We now proceed to consider the more general SISO system characterized by the 
differential equation 
\[ \frac{d^ny}{dt^n} + a_{n-1}\frac{d^{n-1}y}{dt^{n-1}} + \dots + a_0y = b_m\frac{d^mu}{dt^m} + \dots + b_0u \quad (m \le n) \tag{1.66} \]
in which the input involves derivative terms. Again there are various ways of representing 
(1.66) in the state-space form, depending on the choice of the state variables. As an illus- 
tration, we shall consider one possible approach, introducing others in the exercises. 
We define A and b as in (1.60); that is, we take A to be the companion matrix of the 
left-hand side of (1.66), giving 


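\[
A = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 \\
-a_0 & -a_1 & -a_2 & \cdots & -a_{n-2} & -a_{n-1}
\end{bmatrix}
\]

and we take $\mathbf{b} = [0 \;\; 0 \;\; \dots \;\; 0 \;\; 1]^{\mathrm{T}}$. In order to achieve the desired response, the
vector c is then chosen to be

\[ \mathbf{c} = [b_0 \;\; b_1 \;\; \dots \;\; b_m \;\; 0 \;\; \dots \;\; 0]^{\mathrm{T}} \tag{1.67} \]

It is left as an exercise to confirm that this choice is appropriate (see also Section 5.7.1).

Example 1.41

Obtain the state-space model for the system characterized by the differential equation
model

\[ \frac{d^3y}{dt^3} + 6\frac{d^2y}{dt^2} + 11\frac{dy}{dt} + 3y = 5\frac{d^2u}{dt^2} + \frac{du}{dt} + u \tag{1.68} \]

Solution

Taking A to be the companion matrix of the left-hand side in (1.68)

\[ A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -3 & -11 & -6 \end{bmatrix} \quad \text{and} \quad \mathbf{b} = [0 \;\; 0 \;\; 1]^{\mathrm{T}} \]

we have, from (1.67),

\[ \mathbf{c} = [1 \;\; 1 \;\; 5]^{\mathrm{T}} \]

Then from (1.62) the state-space model becomes

\[ \dot{\mathbf{x}} = A\mathbf{x} + \mathbf{b}u, \quad y = \mathbf{c}^{\mathrm{T}}\mathbf{x} \]

This model structure may be depicted by the block diagram of Figure 1.3. It provides
an ideal model for simulation studies, with the state variables being the outputs of the
various integrators involved.

Figure 1.3 Block diagram for the state-space model of Example 1.41.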
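
One way to check numerically that the choice (1.67) is appropriate is sketched below in
Python. It rests on an assumption not developed in this section: with zero initial
conditions, taking Laplace transforms of (1.68) gives the ratio of the right-hand to the
left-hand polynomial in s as the input-output relationship, and the state-space model
should reproduce it as c^T (sI - A)^{-1} b. The function names, the coefficient ordering
(lowest order first) and the sample point s are hypothetical, and the sketch assumes m < n.

```python
def companion_with_numerator(a, b_num):
    """(A, b, c) for y^(n) + a_{n-1} y^(n-1) + ... + a_0 y = b_m u^(m) + ... + b_0 u.

    `a` = [a_0, ..., a_{n-1}] and `b_num` = [b_0, ..., b_m], with m < n assumed.
    """
    n = len(a)
    if len(b_num) > n:
        raise ValueError("this sketch needs m < n")
    A = [[1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n - 1)]
    A.append([-float(ak) for ak in a])          # companion last row
    b = [0.0] * (n - 1) + [1.0]
    c = [float(bk) for bk in b_num] + [0.0] * (n - len(b_num))   # equation (1.67)
    return A, b, c


def transfer_from_state_space(A, b, c, s):
    """Evaluate c^T (sI - A)^{-1} b for a companion matrix A with b = e_n.

    For that structure v = [1, s, ..., s^(n-1)] / p(s) solves (sI - A) v = b,
    where p(s) is the characteristic polynomial; the loop verifies this.
    """
    n = len(A)
    p = s ** n + sum(-A[-1][k] * s ** k for k in range(n))
    v = [s ** k / p for k in range(n)]
    for i in range(n):
        residual = s * v[i] - sum(A[i][j] * v[j] for j in range(n)) - b[i]
        assert abs(residual) < 1e-12
    return sum(c[k] * v[k] for k in range(n))


# Example 1.41: y''' + 6y'' + 11y' + 3y = 5u'' + u' + u
A, b, c = companion_with_numerator([3, 11, 6], [1, 1, 5])
s = 0.7 + 1.3j                                   # arbitrary sample point
G_state = transfer_from_state_space(A, b, c, s)
G_direct = (5 * s**2 + s + 1) / (s**3 + 6 * s**2 + 11 * s + 3)
print(abs(G_state - G_direct))
```

Agreement at a few sample points is a sanity check rather than a proof; the algebraic confirmation is the exercise referred to in the text.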





A distinct advantage of this approach to obtaining the state-space model is that A, b 
and c are readily written down. A possible disadvantage in some applications is that the 
output y itself is not a state variable. An approach in which y is a state variable is 
developed in Exercise 56, Section 5.7.2. In practice, it is also fairly common to choose 
the state variables from a physical consideration, as is illustrated in Example 1.42. 


1.9.2 Multi-input-multi-output (MIMO) systems

Many practical systems are multivariable in nature, being characterized by having more
than one input and/or more than one output. In general terms, the state-space model is
similar to that in (1.63) for SISO systems, except that the input is now a vector $\mathbf{u}(t)$ as
is the output $\mathbf{y}(t)$. Thus the more general form, corresponding to (1.63), of the
state-space model representation of an nth-order multi-input-multi-output (MIMO) system
subject to r inputs and l outputs is

\[ \dot{\mathbf{x}} = A\mathbf{x} + B\mathbf{u} \tag{1.69a} \]
\[ \mathbf{y} = C\mathbf{x} + D\mathbf{u} \tag{1.69b} \]

where x is the n-state vector, u is the r-input vector, y is the l-output vector, A is the
n × n system matrix, B is the n × r control (or input) matrix, and C and D are respectively
l × n and l × r output matrices.

Example 1.42

Obtain the state-space model representation characterizing the two-input-one-output
parallel network shown in Figure 1.4 in the form

\[ \dot{\mathbf{x}} = A\mathbf{x} + B\mathbf{u}, \quad y = \mathbf{c}^{\mathrm{T}}\mathbf{x} + \mathbf{d}^{\mathrm{T}}\mathbf{u} \]

where the elements $x_1, x_2, x_3$ of x and $u_1, u_2$ of u are as indicated in the figure, and the
output y is the voltage drop across the inductor $L_1$ ($v_C$ denotes the voltage drop across
the capacitor C).

Figure 1.4 Parallel circuit of Example 1.42.

Solution

Applying Kirchhoff's second law (see Section 5.4.1) to each of the two loops in turn
gives


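\[ R_1i_1 + L_1\frac{di_1}{dt} + v_C = e_1 \tag{1.70} \]
\[ L_2\frac{di_2}{dt} + v_C = e_2 \tag{1.71} \]

The voltage drop $v_C$ across the capacitor C is given by

\[ \dot{v}_C = \frac{1}{C}(i_1 + i_2) \tag{1.72} \]

The output y, being the voltage drop across the inductor $L_1$, is given by

\[ y = L_1\frac{di_1}{dt} \]

which, using (1.70), gives

\[ y = -R_1i_1 - v_C + e_1 \tag{1.73} \]

Writing $x_1 = i_1$, $x_2 = i_2$, $x_3 = v_C$, $u_1 = e_1$ and $u_2 = e_2$, (1.70)-(1.73) give the state-space
representation as

\[
\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \end{bmatrix}
=
\begin{bmatrix}
-\dfrac{R_1}{L_1} & 0 & -\dfrac{1}{L_1} \\
0 & 0 & -\dfrac{1}{L_2} \\
\dfrac{1}{C} & \dfrac{1}{C} & 0
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
+
\begin{bmatrix}
\dfrac{1}{L_1} & 0 \\
0 & \dfrac{1}{L_2} \\
0 & 0
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}
\]

\[
y = \begin{bmatrix} -R_1 & 0 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
+ \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}
\]

which is of the required form

\[ \dot{\mathbf{x}} = A\mathbf{x} + B\mathbf{u} \]
\[ y = \mathbf{c}^{\mathrm{T}}\mathbf{x} + \mathbf{d}^{\mathrm{T}}\mathbf{u} \]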
1.9.3 Exercises

51 Obtain the state-space forms of the differential equations

(a) \[ \frac{d^3y}{dt^3} + 4\frac{d^2y}{dt^2} + 5\frac{dy}{dt} + 4y = u(t) \]

(b) \[ \frac{d^4y}{dt^4} + 2\frac{d^2y}{dt^2} + 4\frac{dy}{dt} = 5u(t) \]

using the companion form of the system matrix in each case.

52 Obtain the state-space form of the differential equation models

(a) \[ \frac{d^3y}{dt^3} + 6\frac{d^2y}{dt^2} + 5\frac{dy}{dt} + 7y = \frac{d^2u}{dt^2} + 3\frac{du}{dt} + 5u \]

(b) \[ \frac{d^3y}{dt^3} + 4\frac{d^2y}{dt^2} + 3\frac{dy}{dt} = \frac{d^2u}{dt^2} + 3\frac{du}{dt} + 2u \]

using the companion form of the system matrix in each case.

53 Obtain the state-space model of the single-input-single-output network system of
Figure 1.5 in the form $\dot{\mathbf{x}} = A\mathbf{x} + \mathbf{b}u$, $y = \mathbf{c}^{\mathrm{T}}\mathbf{x}$, where u, y and the
elements $x_1, x_2, x_3$ of x are as indicated.

Figure 1.5 Network of Exercise 53.

54 The mass-spring-damper system of Figure 1.6 models the suspension system of a
quarter-car. Obtain a state-space model in which the output represents the body mass
vertical movement y and the input represents the tyre vertical movement u(t) due to the
road surface. All displacements are measured from equilibrium positions.

Figure 1.6 Quarter-car suspension model of Exercise 54.
55 Obtain the state-space model, in the form
$\dot{\mathbf{x}} = A\mathbf{x} + \mathbf{b}u$, $\mathbf{y} = C\mathbf{x} + \mathbf{d}u$, of the one-input-two-output
network illustrated in Figure 1.7. The elements $x_1, x_2$ of the state vector x and $y_1, y_2$ of
the output vector y are as indicated. If $R_1 = 1\,\mathrm{k\Omega}$, $R_2 = 5\,\mathrm{k\Omega}$,
$R_3 = R_4 = 3\,\mathrm{k\Omega}$, $C_1 = C_2 = 1\,\mu\mathrm{F}$, calculate the eigenvalues of the system
matrix A.

Figure 1.7 Network of Exercise 55.


1.10 Solution of the state equation

In this section we are concerned with seeking the solution of the state equation

\[ \dot{\mathbf{x}} = A\mathbf{x} + B\mathbf{u} \tag{1.74} \]

given the value of x at some initial time $t_0$ to be $\mathbf{x}_0$. Having obtained the solution of this
state equation, a system response y may then be readily written down from the linear
transformation (1.69b). As mentioned in Section 1.9.1, an obvious advantage of adopting
the vector-matrix notation of (1.74) is its compactness. In this section we shall see
that another distinct advantage is that (1.74) behaves very much like the corresponding
first-order scalar differential equation

\[ \frac{dx}{dt} = ax + bu, \quad x(t_0) = x_0 \tag{1.75} \]

1.10.1 Direct form of the solution

Before considering the nth-order system represented by (1.74), let us first briefly review the
solution of (1.75). When the input u is zero, (1.75) reduces to the homogeneous equation

\[ \frac{dx}{dt} = ax \tag{1.76} \]

which, by separation of variables,

\[ \int_{x_0}^{x} \frac{dx}{x} = \int_{t_0}^{t} a\,dt \]

gives

\[ \ln x - \ln x_0 = a(t - t_0) \]

leading to the solution

\[ x = x_0 e^{a(t - t_0)} \tag{1.77} \]

for the unforced system.
If we consider the nonhomogeneous equation (1.75) directly, a solution can be
obtained by first multiplying throughout by the integrating factor $e^{-at}$ to obtain

\[ e^{-at}\left(\frac{dx}{dt} - ax\right) = e^{-at}bu(t) \]

or

\[ \frac{d}{dt}\left(e^{-at}x\right) = e^{-at}bu(t) \]

which on integration gives

\[ e^{-at}x - e^{-at_0}x_0 = \int_{t_0}^{t} e^{-a\tau}bu(\tau)\,d\tau \]

leading to the solution

\[ x(t) = e^{a(t-t_0)}x_0 + \int_{t_0}^{t} e^{a(t-\tau)}bu(\tau)\,d\tau \tag{1.78} \]

The first term of the solution, which corresponds to the solution of the unforced system,
is a complementary function, while the convolution integral constituting the second
term, which is dependent on the forcing function u(t), is a particular integral.
Returning to (1.74), we first consider the unforced homogeneous system

\[ \dot{\mathbf{x}} = A\mathbf{x}, \quad \mathbf{x}(t_0) = \mathbf{x}_0 \tag{1.79} \]

which represents the situation when the system is 'relaxing' from an initial state.
The solution is completely analogous to the solution (1.77) of the scalar equation (1.76),
and is of the form

\[ \mathbf{x} = e^{A(t-t_0)}\mathbf{x}_0 \tag{1.80} \]

It is readily shown that this is a solution of (1.79). Using (1.36), differentiation of (1.80)
gives

\[ \dot{\mathbf{x}} = Ae^{A(t-t_0)}\mathbf{x}_0 = A\mathbf{x} \]

so that (1.79) is satisfied. Also, from (1.80),

\[ \mathbf{x}(t_0) = e^{A(t_0-t_0)}\mathbf{x}_0 = I\mathbf{x}_0 = \mathbf{x}_0 \]

using $e^{0} = I$. Thus, since (1.80) satisfies the differential equation and the initial
conditions, it represents the unique solution of (1.79).

Likewise, the nonhomogeneous equation (1.74) may be solved in an analogous manner
to that used for solving (1.75). Premultiplying (1.74) throughout by $e^{-At}$, we obtain

\[ e^{-At}(\dot{\mathbf{x}} - A\mathbf{x}) = e^{-At}B\mathbf{u}(t) \]

or using (1.36),

\[ \frac{d}{dt}\left(e^{-At}\mathbf{x}\right) = e^{-At}B\mathbf{u}(t) \]

Integration then gives

\[ e^{-At}\mathbf{x}(t) - e^{-At_0}\mathbf{x}_0 = \int_{t_0}^{t} e^{-A\tau}B\mathbf{u}(\tau)\,d\tau \]

leading to the solution

\[ \mathbf{x}(t) = e^{A(t-t_0)}\mathbf{x}_0 + \int_{t_0}^{t} e^{A(t-\tau)}B\mathbf{u}(\tau)\,d\tau \tag{1.81} \]

This is analogous to the solution given in (1.78) for the scalar equation (1.75). Again it
contains two terms: one dependent on the initial state and corresponding to the solution
of the unforced system, and one a convolution integral arising from the input. Having
obtained the solution of the state equation, the system output y(t) is then readily obtained
from equation (1.69b).

1.10.2 The transition matrix

The matrix exponential $e^{A(t-t_0)}$ is referred to as the fundamental or transition matrix
and is frequently denoted by $\Phi(t, t_0)$, so that (1.80) is written as

\[ \mathbf{x}(t) = \Phi(t, t_0)\mathbf{x}_0 \tag{1.82} \]

This is an important matrix, which can be used to characterize a linear system, and in
the absence of any input it maps a given state $\mathbf{x}_0$ at any time $t_0$ to the state $\mathbf{x}(t)$ at any
time t, as illustrated in Figure 1.8(a).

Figure 1.8 (a) Transition matrix $\Phi(t, t_0)$. (b) The transition property. (c) The inverse $\Phi^{-1}(t, t_0)$.


Using the properties of the exponential matrix given in Section 1.7, certain properties
of the transition matrix may be deduced. From

\[ e^{A(t_1+t_2)} = e^{At_1}e^{At_2} \]

it follows that $\Phi(t, t_0)$ satisfies the transition property

\[ \Phi(t_2, t_0) = \Phi(t_2, t_1)\Phi(t_1, t_0) \tag{1.83} \]

for any $t_0$, $t_1$ and $t_2$, as illustrated in Figure 1.8(b). From

\[ e^{At}e^{-At} = I \]

it follows that the inverse $\Phi^{-1}(t, t_0)$ of the transition matrix is obtained by negating time,
so that

\[ \Phi^{-1}(t, t_0) = \Phi(-t, -t_0) = \Phi(t_0, t) \tag{1.84} \]

for any $t_0$ and t, as illustrated in Figure 1.8(c).
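
The transition property (1.83) and the inverse property (1.84) can be checked numerically.
The Python sketch below is illustrative only: it approximates the matrix exponential by a
truncated power series (adequate for the small matrix and time intervals used here, but not
a robust general-purpose method), and the matrix A and the times t0, t1, t2 are hypothetical
values chosen for the check.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def expm_series(A, t, terms=60):
    """Approximate e^{At} by the truncated series sum_{k < terms} (At)^k / k!."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term <- term * (A t) / k, so that term equals (At)^k / k!
        term = [[sum(term[i][m] * A[m][j] for m in range(n)) * t / k for j in range(n)]
                for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def phi(A, t, t0):
    """Transition matrix Phi(t, t0) = e^{A(t - t0)} for a constant matrix A."""
    return expm_series(A, t - t0)


A = [[0.0, 1.0], [-2.0, -3.0]]      # hypothetical constant system matrix
t0, t1, t2 = 0.2, 0.9, 1.7          # hypothetical times

lhs = phi(A, t2, t0)                                  # Phi(t2, t0)
rhs = mat_mul(phi(A, t2, t1), phi(A, t1, t0))         # Phi(t2, t1) Phi(t1, t0)
identity = mat_mul(phi(A, t2, t0), phi(A, t0, t2))    # Phi(t2, t0) Phi(t0, t2)
print(lhs, rhs, identity)
```

Up to rounding, lhs and rhs should coincide and identity should be the unit matrix.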
1.10.3 Evaluating the transition matrix

Since, when dealing with time-invariant systems, there is no loss of generality in taking
$t_0 = 0$, we shall, for convenience, consider the evaluation of the transition matrix

\[ \Phi(t) = \Phi(t, 0) = e^{At} \]

Clearly, methods of evaluating this are readily applicable to the evaluation of

\[ \Phi(t, \tau) = e^{A(t-\tau)} \]

Indeed, since A is a constant matrix,

\[ \Phi(t, \tau) = \Phi(t - \tau, 0) \]

so, having obtained $\Phi(t)$, we can write down $\Phi(t, \tau)$ by simply replacing t by $t - \tau$.
Since A is a constant matrix the methods discussed in Section 1.7 are applicable for
evaluating the transition matrix. From (1.34a),

\[ e^{At} = \alpha_0(t)I + \alpha_1(t)A + \alpha_2(t)A^2 + \dots + \alpha_{n-1}(t)A^{n-1} \tag{1.85a} \]

where, using (1.34b), the $\alpha_i(t)$ (i = 0, 1, ..., n - 1) are obtained by solving
simultaneously the n equations

\[ e^{\lambda_j t} = \alpha_0(t) + \alpha_1(t)\lambda_j + \alpha_2(t)\lambda_j^2 + \dots + \alpha_{n-1}(t)\lambda_j^{n-1} \tag{1.85b} \]

where $\lambda_j$ (j = 1, 2, ..., n) are the eigenvalues of A. As in Section 1.7, if A has
repeated eigenvalues then derivatives of $e^{\lambda t}$, with respect to $\lambda$, will have to be used.

Example 1.43

A system is characterized by the state equation

\[
\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \end{bmatrix}
=
\begin{bmatrix} -1 & 0 \\ 1 & -3 \end{bmatrix}
\begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}
+
\begin{bmatrix} 1 \\ 1 \end{bmatrix} u(t)
\]

Given that the input is the unit step function

\[ u(t) = H(t) = \begin{cases} 0 & (t < 0) \\ 1 & (t \ge 0) \end{cases} \]

and initially

\[ x_1(0) = x_2(0) = 1 \]

deduce the state $\mathbf{x}(t) = [x_1(t) \;\; x_2(t)]^{\mathrm{T}}$ of the system at subsequent time t.

Solution

From (1.81), the solution is given by

\[ \mathbf{x}(t) = e^{At}\mathbf{x}(0) + \int_0^t e^{A(t-\tau)}\mathbf{b}u(\tau)\,d\tau \tag{1.86} \]

where

\[ A = \begin{bmatrix} -1 & 0 \\ 1 & -3 \end{bmatrix}, \quad \mathbf{b} = [1 \;\; 1]^{\mathrm{T}} \]
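Since A is a 2 × 2 matrix, it follows from (1.85a) that

\[ e^{At} = \alpha_0(t)I + \alpha_1(t)A \]

The eigenvalues of A are $\lambda_1 = -1$ and $\lambda_2 = -3$, so, using (1.85b), we have

\[ \alpha_0(t) = \tfrac{1}{2}(3e^{-t} - e^{-3t}), \quad \alpha_1(t) = \tfrac{1}{2}(e^{-t} - e^{-3t}) \]

giving

\[ e^{At} = \begin{bmatrix} e^{-t} & 0 \\ \tfrac{1}{2}(e^{-t} - e^{-3t}) & e^{-3t} \end{bmatrix} \]

Thus the first term in (1.86) becomes

\[
e^{At}\mathbf{x}(0) = \begin{bmatrix} e^{-t} & 0 \\ \tfrac{1}{2}(e^{-t} - e^{-3t}) & e^{-3t} \end{bmatrix}
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} e^{-t} \\ \tfrac{1}{2}(e^{-t} + e^{-3t}) \end{bmatrix}
\]

and the second term is

\[
\begin{aligned}
\int_0^t e^{A(t-\tau)}\mathbf{b}u(\tau)\,d\tau
&= \int_0^t \begin{bmatrix} e^{-(t-\tau)} & 0 \\ \tfrac{1}{2}(e^{-(t-\tau)} - e^{-3(t-\tau)}) & e^{-3(t-\tau)} \end{bmatrix}
\begin{bmatrix} 1 \\ 1 \end{bmatrix} 1\,d\tau \\
&= \int_0^t \begin{bmatrix} e^{-(t-\tau)} \\ \tfrac{1}{2}(e^{-(t-\tau)} + e^{-3(t-\tau)}) \end{bmatrix} d\tau \\
&= \begin{bmatrix} e^{-(t-\tau)} \\ \tfrac{1}{2}e^{-(t-\tau)} + \tfrac{1}{6}e^{-3(t-\tau)} \end{bmatrix}_0^t \\
&= \begin{bmatrix} 1 - e^{-t} \\ \tfrac{2}{3} - \tfrac{1}{2}e^{-t} - \tfrac{1}{6}e^{-3t} \end{bmatrix}
\end{aligned}
\]

Substituting back in (1.86) gives the required solution as

\[
\mathbf{x}(t) = \begin{bmatrix} e^{-t} \\ \tfrac{1}{2}(e^{-t} + e^{-3t}) \end{bmatrix}
+ \begin{bmatrix} 1 - e^{-t} \\ \tfrac{2}{3} - \tfrac{1}{2}e^{-t} - \tfrac{1}{6}e^{-3t} \end{bmatrix}
= \begin{bmatrix} 1 \\ \tfrac{2}{3} + \tfrac{1}{3}e^{-3t} \end{bmatrix}
\]

That is,

\[ x_1(t) = 1, \quad x_2(t) = \tfrac{2}{3} + \tfrac{1}{3}e^{-3t} \]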
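
As an independent check on the working of Example 1.43, the state equation can be
integrated numerically and compared with the closed-form result. The Python sketch below
is illustrative only: the classical fourth-order Runge-Kutta scheme, the step count and the
comparison time t = 1.5 are choices made here and are not part of the method of the text.

```python
import math

A = [[-1.0, 0.0], [1.0, -3.0]]
b = [1.0, 1.0]


def f(x, u):
    """Right-hand side A x + b u of the state equation of Example 1.43."""
    return [A[0][0] * x[0] + A[0][1] * x[1] + b[0] * u,
            A[1][0] * x[0] + A[1][1] * x[1] + b[1] * u]


def rk4(x0, t_end, steps=2000, u=1.0):
    """Integrate x' = A x + b u from t = 0 to t_end with a unit step input."""
    h = t_end / steps
    x = list(x0)
    for _ in range(steps):
        k1 = f(x, u)
        k2 = f([x[i] + 0.5 * h * k1[i] for i in range(2)], u)
        k3 = f([x[i] + 0.5 * h * k2[i] for i in range(2)], u)
        k4 = f([x[i] + h * k3[i] for i in range(2)], u)
        x = [x[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(2)]
    return x


def closed_form(t):
    """State obtained in the worked solution: x1 = 1, x2 = 2/3 + e^{-3t}/3."""
    return [1.0, 2.0 / 3.0 + math.exp(-3.0 * t) / 3.0]


t = 1.5
numeric = rk4([1.0, 1.0], t)
exact = closed_form(t)
print(numeric, exact)
```

The two printed vectors should agree to within the accuracy of the numerical integration.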
Using the Symbolic Math Toolbox in MATLAB the transition matrix $e^{At}$ is generated
by the sequence of commands

syms t
A = [specify];
A = sym(A);
E = expm(t*A);
pretty(E)

Confirm this using the matrix A = [-1 0; 1 -3] of Example 1.43.
In MAPLE $e^{At}$ is returned by the commands:

with(LinearAlgebra):
A := Matrix([[-1,0],[1,-3]]);
MatrixExponential(A,t);

1.10.4 Exercises

Check your answers using MATLAB or MAPLE whenever possible.


56 


57 


58 


Obtain the transition matrix 6f) of the system 59 
х= Ах 


where 


Verify that &(7) has the following properties: 

(a) Ф(0)=1; 

(b D) = Pl 4) B(n); ED 
(с) Ф (0) = e(-r. 


Writing x, = y and x, = dy/dt express the differential 
equation 
d'y | dy 
+2—+y=0 
? dt ^ 
61 


іп ће уесіог-таііх ѓогт х = Ах, х = [х х]. 
Obtain the transition matrix and hence solve the 
differential equation given that y = dy/dt = 1 when 
t=0. Confirm your answer by direct solution of the 
second-order differential equation. 


Solve 


subject to x(0) 2 [1 1]. 


Find the solution of 


ie 
-6 -5|| x, 6 


where u(f) 22 and x(0) 2 [1 —l]’. 
Using (1.81), find the response for t = 0 of the 
system 

ži =x + 2u 

X,-2-2x, 3x, 
to an input u(t) = e” and subject to the initial 
conditions x,(0) = 0, x,(0) = 1. 


A system is governed by the vector—matrix 
differential equation 


wel) ios @ (t > 0) 
2-3 i1 


where x(t) and u(t) are respectively the state 

and input vectors of the system. Determine 

the transition matrix of this system, and hence 
obtain an explicit expression for x(t) for the input 
u(t)=[4 3]' and subject to the initial condition 
x(0)=[1 21. 


1.10.5 Spectral representation of response

We first consider the unforced system

\[ \dot{\mathbf{x}}(t) = A\mathbf{x}(t) \tag{1.87} \]

with the initial state $\mathbf{x}(t_0)$ at time $t_0$ given, and assume that the matrix A has n distinct
eigenvalues $\lambda_i$ (i = 1, 2, ..., n) corresponding to n linearly independent eigenvectors
$\mathbf{e}_i$ (i = 1, 2, ..., n). Since the n eigenvectors are linearly independent, they may be used
as a basis for the n-dimensional state space, so that the system state $\mathbf{x}(t)$ may be written
as a linear combination in the form

\[ \mathbf{x}(t) = c_1(t)\mathbf{e}_1 + \dots + c_n(t)\mathbf{e}_n \tag{1.88} \]

where, since the eigenvectors are constant, the time-varying nature of $\mathbf{x}(t)$ is reflected
in the coefficients $c_i(t)$. Substituting (1.88) into (1.87) gives

\[ \dot{c}_1(t)\mathbf{e}_1 + \dots + \dot{c}_n(t)\mathbf{e}_n = A[c_1(t)\mathbf{e}_1 + \dots + c_n(t)\mathbf{e}_n] \tag{1.89} \]

Since $(\lambda_i, \mathbf{e}_i)$ are spectral pairs (that is, eigenvalue-eigenvector pairs) for the matrix A,

\[ A\mathbf{e}_i = \lambda_i\mathbf{e}_i \quad (i = 1, 2, \dots, n) \]

(1.89) may be written as

\[ [\dot{c}_1(t) - \lambda_1c_1(t)]\mathbf{e}_1 + \dots + [\dot{c}_n(t) - \lambda_nc_n(t)]\mathbf{e}_n = \mathbf{0} \tag{1.90} \]

Because the eigenvectors $\mathbf{e}_i$ are linearly independent, it follows from (1.90) that the
system (1.87) is completely represented by the set of uncoupled differential equations

\[ \dot{c}_i(t) - \lambda_ic_i(t) = 0 \quad (i = 1, 2, \dots, n) \tag{1.91} \]

with solutions of the form

\[ c_i(t) = e^{\lambda_i(t-t_0)}c_i(t_0) \]

Then, using (1.88), the system response is

\[ \mathbf{x}(t) = \sum_{i=1}^{n} c_i(t_0)e^{\lambda_i(t-t_0)}\mathbf{e}_i \tag{1.92} \]

Using the given information about the initial state,

\[ \mathbf{x}(t_0) = \sum_{i=1}^{n} c_i(t_0)\mathbf{e}_i \tag{1.93} \]

so that the constants $c_i(t_0)$ may be found from the given initial state using the reciprocal
basis vectors $\mathbf{r}_i$ (i = 1, 2, ..., n) defined by

\[ \mathbf{r}_i^{\mathrm{T}}\mathbf{e}_j = \delta_{ij} \]

where $\delta_{ij}$ is the Kronecker delta. Taking the scalar product of both sides of (1.93) with
$\mathbf{r}_k$, we have

\[ \mathbf{r}_k^{\mathrm{T}}\mathbf{x}(t_0) = \sum_{i=1}^{n} c_i(t_0)\mathbf{r}_k^{\mathrm{T}}\mathbf{e}_i = c_k(t_0) \quad (k = 1, 2, \dots, n) \]
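which on substituting in (1.92) gives the system response

\[ \mathbf{x}(t) = \sum_{i=1}^{n} \mathbf{r}_i^{\mathrm{T}}\mathbf{x}(t_0)e^{\lambda_i(t-t_0)}\mathbf{e}_i \tag{1.94} \]

which is referred to as the spectral or modal form of the response. The terms
$\mathbf{r}_i^{\mathrm{T}}\mathbf{x}(t_0)e^{\lambda_i(t-t_0)}\mathbf{e}_i$ are called the modes of the system. Thus, provided that the system
matrix A has n linearly independent eigenvectors, this approach has the advantage of
enabling us to break down the general system response into the sum of its simple modal
responses. The amount of excitation of each mode, represented by $\mathbf{r}_i^{\mathrm{T}}\mathbf{x}(t_0)$, is dependent
only on the initial conditions, so if, for example, the initial state $\mathbf{x}(t_0)$ is parallel to the
ith eigenvector $\mathbf{e}_i$ then only the ith mode will be excited.

It should be noted that if a pair of eigenvalues $\lambda_1$, $\lambda_2$ are complex conjugates then
the modes associated with $e^{\lambda_1(t-t_0)}$ and $e^{\lambda_2(t-t_0)}$ cannot be separated from each other. The
combined motion takes place in a plane determined by the corresponding eigenvectors
$\mathbf{e}_1$ and $\mathbf{e}_2$ and is oscillatory.

By retaining only the dominant modes, the spectral representation may be used to
approximate high-order systems by lower-order ones.

Example 1.44

Obtain in spectral form the response of the second-order system

\[
\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix}
= \begin{bmatrix} -2 & 1 \\ 1 & -2 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix},
\quad \mathbf{x}(0) = \begin{bmatrix} 2 \\ 1 \end{bmatrix}
\]

and sketch the trajectory.

Solution

The eigenvalues of the matrix

\[ A = \begin{bmatrix} -2 & 1 \\ 1 & -2 \end{bmatrix} \]

are determined by

\[ |A - \lambda I| = \lambda^2 + 4\lambda + 3 = 0 \]

that is,

\[ \lambda_1 = -1, \quad \lambda_2 = -3 \]

with corresponding eigenvectors

\[ \mathbf{e}_1 = [1 \;\; 1]^{\mathrm{T}}, \quad \mathbf{e}_2 = [1 \;\; -1]^{\mathrm{T}} \]

Denoting the reciprocal basis vectors by

\[ \mathbf{r}_1 = [r_{11} \;\; r_{12}]^{\mathrm{T}}, \quad \mathbf{r}_2 = [r_{21} \;\; r_{22}]^{\mathrm{T}} \]

and using the relationships

\[ \mathbf{r}_i^{\mathrm{T}}\mathbf{e}_j = \delta_{ij} \quad (i, j = 1, 2) \]

we have

\[ \mathbf{r}_1^{\mathrm{T}}\mathbf{e}_1 = r_{11} + r_{12} = 1, \quad \mathbf{r}_1^{\mathrm{T}}\mathbf{e}_2 = r_{11} - r_{12} = 0 \]

giving

\[ r_{11} = \tfrac{1}{2}, \quad r_{12} = \tfrac{1}{2}, \quad \mathbf{r}_1 = [\tfrac{1}{2} \;\; \tfrac{1}{2}]^{\mathrm{T}} \]

and

\[ \mathbf{r}_2^{\mathrm{T}}\mathbf{e}_1 = r_{21} + r_{22} = 0, \quad \mathbf{r}_2^{\mathrm{T}}\mathbf{e}_2 = r_{21} - r_{22} = 1 \]

giving

\[ r_{21} = \tfrac{1}{2}, \quad r_{22} = -\tfrac{1}{2}, \quad \mathbf{r}_2 = [\tfrac{1}{2} \;\; -\tfrac{1}{2}]^{\mathrm{T}} \]

Thus

\[ \mathbf{r}_1^{\mathrm{T}}\mathbf{x}(0) = 1 + \tfrac{1}{2} = \tfrac{3}{2}, \quad \mathbf{r}_2^{\mathrm{T}}\mathbf{x}(0) = 1 - \tfrac{1}{2} = \tfrac{1}{2} \]

so that, from (1.94), the system response is

\[ \mathbf{x}(t) = \sum_{i=1}^{2} \mathbf{r}_i^{\mathrm{T}}\mathbf{x}(0)e^{\lambda_it}\mathbf{e}_i = \mathbf{r}_1^{\mathrm{T}}\mathbf{x}(0)e^{\lambda_1t}\mathbf{e}_1 + \mathbf{r}_2^{\mathrm{T}}\mathbf{x}(0)e^{\lambda_2t}\mathbf{e}_2 \]

That is,

\[ \mathbf{x}(t) = \tfrac{3}{2}e^{-t}\mathbf{e}_1 + \tfrac{1}{2}e^{-3t}\mathbf{e}_2 \]

which is in the required spectral form.

To plot the response, we first draw axes corresponding to the eigenvectors $\mathbf{e}_1$ and $\mathbf{e}_2$,
as shown in Figure 1.9. Taking these as coordinate axes, we are at the point $(\tfrac{3}{2}, \tfrac{1}{2})$ at
time t = 0. As t increases, the movement along the direction of $\mathbf{e}_2$ is much faster than
that in the direction of $\mathbf{e}_1$, since $e^{-3t}$ decreases more rapidly than $e^{-t}$. We can therefore
guess the trajectory, without plotting, as sketched in Figure 1.9.

Figure 1.9 Trajectory for Example 1.44.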
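
The reciprocal basis calculation of Example 1.44 can be reproduced in a few lines. The
Python sketch below is illustrative only: it assumes the system matrix with rows (-2, 1) and
(1, -2) and the initial state x(0) = [2 1]^T, which are the values consistent with the
eigenvalues, eigenvectors and excitations quoted in the solution, and the helper names are
hypothetical. The reciprocal basis vectors are obtained as the rows of the inverse of the
matrix whose columns are the eigenvectors.

```python
import math

x0 = [2.0, 1.0]                               # initial state (assumed, see lead-in)
eigenvalues = [-1.0, -3.0]
eigenvectors = [[1.0, 1.0], [1.0, -1.0]]      # e_1 and e_2


def reciprocal_basis_2x2(e1, e2):
    """Rows of M^{-1}, M = [e1 e2]; they satisfy r_i . e_j = delta_ij."""
    det = e1[0] * e2[1] - e2[0] * e1[1]
    r1 = [e2[1] / det, -e2[0] / det]
    r2 = [-e1[1] / det, e1[0] / det]
    return r1, r2


def modal_response(t):
    """Spectral form (1.94) with t0 = 0: sum of r_i^T x(0) e^{lambda_i t} e_i."""
    r = reciprocal_basis_2x2(*eigenvectors)
    x = [0.0, 0.0]
    for lam, e, ri in zip(eigenvalues, eigenvectors, r):
        excitation = ri[0] * x0[0] + ri[1] * x0[1]     # r_i^T x(0)
        for k in range(2):
            x[k] += excitation * math.exp(lam * t) * e[k]
    return x


r1, r2 = reciprocal_basis_2x2(*eigenvectors)
c1 = r1[0] * x0[0] + r1[1] * x0[1]            # excitation of mode 1
c2 = r2[0] * x0[0] + r2[1] * x0[1]            # excitation of mode 2
x_at_0 = modal_response(0.0)
print(r1, r2, c1, c2, x_at_0)
```

Evaluating modal_response at a sequence of times gives points on the trajectory of Figure 1.9 in the original coordinates.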
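We can proceed in an analogous manner to obtain the spectral representation of the
response to the forced system

\[ \dot{\mathbf{x}}(t) = A\mathbf{x}(t) + B\mathbf{u}(t) \]

with $\mathbf{x}(t_0)$ given. Making the same assumption regarding the linear independence of the
eigenvectors $\mathbf{e}_i$ (i = 1, 2, ..., n) of the matrix A, the vector $B\mathbf{u}(t)$ may also be written
as a linear combination of the form

\[ B\mathbf{u}(t) = \sum_{i=1}^{n} \beta_i(t)\mathbf{e}_i \tag{1.95} \]

so that, corresponding to (1.90), we have

\[ [\dot{c}_1(t) - \lambda_1c_1(t) - \beta_1(t)]\mathbf{e}_1 + \dots + [\dot{c}_n(t) - \lambda_nc_n(t) - \beta_n(t)]\mathbf{e}_n = \mathbf{0} \]

As a consequence of the linear independence of the eigenvectors $\mathbf{e}_i$ this leads to the set
of uncoupled differential equations

\[ \dot{c}_i(t) - \lambda_ic_i(t) - \beta_i(t) = 0 \quad (i = 1, 2, \dots, n) \]

which, using (1.78), have corresponding solutions

\[ c_i(t) = e^{\lambda_i(t-t_0)}c_i(t_0) + \int_{t_0}^{t} e^{\lambda_i(t-\tau)}\beta_i(\tau)\,d\tau \tag{1.96} \]

As for $c_i(t_0)$, the reciprocal basis vectors $\mathbf{r}_i$ may be used to obtain the coefficients $\beta_i(t)$.
Taking the scalar product of both sides of (1.95) with $\mathbf{r}_k$ and using the relationships
$\mathbf{r}_i^{\mathrm{T}}\mathbf{e}_j = \delta_{ij}$, we have

\[ \mathbf{r}_k^{\mathrm{T}}B\mathbf{u}(t) = \beta_k(t) \quad (k = 1, 2, \dots, n) \]

Thus, from (1.96),

\[ c_i(t) = e^{\lambda_i(t-t_0)}\mathbf{r}_i^{\mathrm{T}}\mathbf{x}(t_0) + \int_{t_0}^{t} e^{\lambda_i(t-\tau)}\mathbf{r}_i^{\mathrm{T}}B\mathbf{u}(\tau)\,d\tau \]

giving the spectral form of the system response as

\[ \mathbf{x}(t) = \sum_{i=1}^{n} c_i(t)\mathbf{e}_i \]

1.10.6 Canonical representation

Consider the state-space representation given in (1.69), namely

\[ \dot{\mathbf{x}} = A\mathbf{x} + B\mathbf{u} \tag{1.69a} \]
\[ \mathbf{y} = C\mathbf{x} + D\mathbf{u} \tag{1.69b} \]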

Applying the transformation

\[ \mathbf{x} = T\mathbf{z} \]

where T is a non-singular matrix, leads to

\[ T\dot{\mathbf{z}} = AT\mathbf{z} + B\mathbf{u} \]
\[ \mathbf{y} = CT\mathbf{z} + D\mathbf{u} \]

which may be written in the form

\[ \dot{\mathbf{z}} = \tilde{A}\mathbf{z} + \tilde{B}\mathbf{u} \tag{1.97a} \]
\[ \mathbf{y} = \tilde{C}\mathbf{z} + \tilde{D}\mathbf{u} \tag{1.97b} \]

where z is now a state vector and

\[ \tilde{A} = T^{-1}AT, \quad \tilde{B} = T^{-1}B, \quad \tilde{C} = CT, \quad \tilde{D} = D \]

The system input-output relationship is unchanged by the transformation (see Section
5.7.3), and the linear systems (1.69) and (1.97) are said to be equivalent. By the
transformation the intrinsic properties of the system, such as stability, controllability and
observability, which are of interest to the engineer, are preserved, and there is merit in
seeking a transformation leading to a system that is more easily analysed.

Since the transformation matrix T can be arbitrarily chosen, an infinite number of
equivalent systems exist. Of particular interest is the case when T is taken to be the
modal matrix M of the system matrix A; that is,

\[ T = M = [\mathbf{e}_1 \;\; \mathbf{e}_2 \;\; \dots \;\; \mathbf{e}_n] \]

where $\mathbf{e}_i$ (i = 1, 2, ..., n) are the eigenvectors of the matrix A. Under the assumption
that the n eigenvalues are distinct,

\[ \tilde{A} = M^{-1}AM = \Lambda, \text{ the spectral matrix of } A \]
\[ \tilde{B} = M^{-1}B \]
\[ \tilde{C} = CM, \quad \tilde{D} = D \]

so that (1.97) becomes

\[ \dot{\mathbf{z}} = \Lambda\mathbf{z} + M^{-1}B\mathbf{u} \tag{1.98a} \]
\[ \mathbf{y} = CM\mathbf{z} + D\mathbf{u} \tag{1.98b} \]

Equation (1.98a) constitutes a system of uncoupled linear differential equations

\[ \dot{z}_i = \lambda_iz_i + \mathbf{b}_i^{\mathrm{T}}\mathbf{u} \quad (i = 1, 2, \dots, n) \tag{1.99} \]

where $\mathbf{z} = (z_1, z_2, \dots, z_n)^{\mathrm{T}}$ and $\mathbf{b}_i^{\mathrm{T}}$ is the ith row of the matrix $M^{-1}B$. Thus, by reducing
(1.69) to the equivalent form (1.98) using the transformation $\mathbf{x} = M\mathbf{z}$, the modes
of the system have been uncoupled, with the new state variables $z_i$ (i = 1, 2, ..., n)
being associated with the ith mode only. The representation (1.98) is called the normal
or canonical representation of the system equations.
From (1.78), the solution of (1.99) is

\[ z_i = e^{\lambda_i(t-t_0)}z_i(t_0) + \int_{t_0}^{t} e^{\lambda_i(t-\tau)}\mathbf{b}_i^{\mathrm{T}}\mathbf{u}(\tau)\,d\tau \quad (i = 1, \dots, n) \]
so that the solution of (1.98a) may be written as

\[ \mathbf{z}(t) = e^{\Lambda(t-t_0)}\mathbf{z}(t_0) + \int_{t_0}^{t} e^{\Lambda(t-\tau)}M^{-1}B\mathbf{u}(\tau)\,d\tau \tag{1.100} \]

where

\[
e^{\Lambda(t-t_0)} = \begin{bmatrix}
e^{\lambda_1(t-t_0)} & & 0 \\
& \ddots & \\
0 & & e^{\lambda_n(t-t_0)}
\end{bmatrix}
\]

In terms of the original state vector $\mathbf{x}(t)$, (1.100) becomes

\[ \mathbf{x}(t) = M\mathbf{z} = Me^{\Lambda(t-t_0)}M^{-1}\mathbf{x}(t_0) + \int_{t_0}^{t} Me^{\Lambda(t-\tau)}M^{-1}B\mathbf{u}(\tau)\,d\tau \tag{1.101} \]

and the system response is then obtained from (1.69b) as

\[ \mathbf{y}(t) = C\mathbf{x}(t) + D\mathbf{u}(t) \]

By comparing the response (1.101) with that in (1.81), we note that the transition matrix
may be written as

\[ \Phi(t, t_0) = e^{A(t-t_0)} = Me^{\Lambda(t-t_0)}M^{-1} \]

The representation (1.98) may be used to readily infer some system properties. If the
system is stable then each mode must be stable, so, from (1.101), each $\lambda_i$ (i = 1, 2, ..., n)
must have a negative real part. If, for example, the jth row of the matrix $M^{-1}B$ is zero
then, from (1.99), $\dot{z}_j - \lambda_jz_j = 0$, so the input $\mathbf{u}(t)$ has no influence on the jth mode of the
system, and the mode is said to be uncontrollable. A system is said to be controllable
if all of its modes are controllable.

If the jth column of the matrix CM is zero then, from (1.98b), the response y is
independent of $z_j$, so it is not possible to use information about the output to identify $z_j$.
The state $z_j$ is then said to be unobservable, and the overall system is not observable.


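Example 1.45

A third-order system is characterized by the state-space model

\[
\dot{\mathbf{x}} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & -5 & -6 \end{bmatrix}\mathbf{x}
+ \begin{bmatrix} 1 \\ -3 \\ 18 \end{bmatrix}u,
\quad y = [1 \;\; 0 \;\; 0]\mathbf{x}
\]

where $\mathbf{x} = [x_1 \;\; x_2 \;\; x_3]^{\mathrm{T}}$. Obtain the equivalent canonical representation of the model
and then obtain the response of the system to a unit step u(t) = H(t) given that initially
$\mathbf{x}(0) = [1 \;\; 1 \;\; 0]^{\mathrm{T}}$.

Solution

The eigenvalues of the matrix

\[ A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & -5 & -6 \end{bmatrix} \]

are determined by

\[ |A - \lambda I| = \begin{vmatrix} -\lambda & 1 & 0 \\ 0 & -\lambda & 1 \\ 0 & -5 & -6-\lambda \end{vmatrix} = 0 \]

that is,

\[ \lambda(\lambda^2 + 6\lambda + 5) = 0 \]

giving $\lambda_1 = 0$, $\lambda_2 = -1$ and $\lambda_3 = -5$, with corresponding eigenvectors

\[ \mathbf{e}_1 = [1 \;\; 0 \;\; 0]^{\mathrm{T}}, \quad \mathbf{e}_2 = [1 \;\; -1 \;\; 1]^{\mathrm{T}}, \quad \mathbf{e}_3 = [1 \;\; -5 \;\; 25]^{\mathrm{T}} \]

The corresponding modal and spectral matrices are

\[
M = \begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & -5 \\ 0 & 1 & 25 \end{bmatrix},
\quad
\Lambda = \begin{bmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -5 \end{bmatrix}
\]

and the inverse modal matrix is determined to be

\[ M^{-1} = \frac{1}{20}\begin{bmatrix} 20 & 24 & 4 \\ 0 & -25 & -5 \\ 0 & 1 & 1 \end{bmatrix} \]

In this case $B = [1 \;\; -3 \;\; 18]^{\mathrm{T}}$, so

\[
M^{-1}B = \frac{1}{20}\begin{bmatrix} 20 & 24 & 4 \\ 0 & -25 & -5 \\ 0 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ -3 \\ 18 \end{bmatrix}
= \frac{1}{20}\begin{bmatrix} 20 \\ -15 \\ 15 \end{bmatrix}
= \begin{bmatrix} 1 \\ -\tfrac{3}{4} \\ \tfrac{3}{4} \end{bmatrix}
\]

Likewise, $C = [1 \;\; 0 \;\; 0]$, giving

\[ CM = [1 \;\; 0 \;\; 0]\begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & -5 \\ 0 & 1 & 25 \end{bmatrix} = [1 \;\; 1 \;\; 1] \]

Thus, from (1.98), the equivalent canonical state-space representation is

\[
\dot{\mathbf{z}} = \begin{bmatrix} \dot{z}_1 \\ \dot{z}_2 \\ \dot{z}_3 \end{bmatrix}
= \begin{bmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -5 \end{bmatrix}
\begin{bmatrix} z_1 \\ z_2 \\ z_3 \end{bmatrix}
+ \begin{bmatrix} 1 \\ -\tfrac{3}{4} \\ \tfrac{3}{4} \end{bmatrix}u
\tag{1.102a}
\]

\[ y = [1 \;\; 1 \;\; 1]\begin{bmatrix} z_1 \\ z_2 \\ z_3 \end{bmatrix} \tag{1.102b} \]

When u(t) = H(t), from (1.100) the solution of (1.102a) is

\[
\mathbf{z} = \begin{bmatrix} e^{0t} & 0 & 0 \\ 0 & e^{-t} & 0 \\ 0 & 0 & e^{-5t} \end{bmatrix}\mathbf{z}(0)
+ \int_0^t \begin{bmatrix} e^{0(t-\tau)} & 0 & 0 \\ 0 & e^{-(t-\tau)} & 0 \\ 0 & 0 & e^{-5(t-\tau)} \end{bmatrix}
\begin{bmatrix} 1 \\ -\tfrac{3}{4} \\ \tfrac{3}{4} \end{bmatrix} 1\,d\tau
\]

where

\[
\mathbf{z}(0) = M^{-1}\mathbf{x}(0) = \frac{1}{20}\begin{bmatrix} 20 & 24 & 4 \\ 0 & -25 & -5 \\ 0 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} \tfrac{11}{5} \\ -\tfrac{5}{4} \\ \tfrac{1}{20} \end{bmatrix}
\]

leading to

\[
\begin{aligned}
\mathbf{z} &= \begin{bmatrix} \tfrac{11}{5} \\ -\tfrac{5}{4}e^{-t} \\ \tfrac{1}{20}e^{-5t} \end{bmatrix}
+ \int_0^t \begin{bmatrix} 1 \\ -\tfrac{3}{4}e^{-(t-\tau)} \\ \tfrac{3}{4}e^{-5(t-\tau)} \end{bmatrix} d\tau \\
&= \begin{bmatrix} \tfrac{11}{5} \\ -\tfrac{5}{4}e^{-t} \\ \tfrac{1}{20}e^{-5t} \end{bmatrix}
+ \begin{bmatrix} t \\ -\tfrac{3}{4} + \tfrac{3}{4}e^{-t} \\ \tfrac{3}{20} - \tfrac{3}{20}e^{-5t} \end{bmatrix}
= \begin{bmatrix} t + \tfrac{11}{5} \\ -\tfrac{3}{4} - \tfrac{1}{2}e^{-t} \\ \tfrac{3}{20} - \tfrac{1}{10}e^{-5t} \end{bmatrix}
\end{aligned}
\]

Then, from (1.102b),

\[
\begin{aligned}
y = z_1 + z_2 + z_3 &= (t + \tfrac{11}{5}) + (-\tfrac{3}{4} - \tfrac{1}{2}e^{-t}) + (\tfrac{3}{20} - \tfrac{1}{10}e^{-5t}) \\
&= t + \tfrac{8}{5} - \tfrac{1}{2}e^{-t} - \tfrac{1}{10}e^{-5t}
\end{aligned}
\]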
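
The matrix arithmetic of Example 1.45 lends itself to an exact check. The Python sketch
below is illustrative only: it uses exact rational arithmetic from the standard library to
confirm that the quoted inverse modal matrix diagonalizes A and to recompute the products
M^{-1}B, CM and z(0) = M^{-1}x(0); the helper names are hypothetical.

```python
from fractions import Fraction as F
import math

A = [[0, 1, 0], [0, 0, 1], [0, -5, -6]]
B = [1, -3, 18]
C = [1, 0, 0]
M = [[1, 1, 1], [0, -1, -5], [0, 1, 25]]          # columns are e_1, e_2, e_3
M_inv = [[F(20, 20), F(24, 20), F(4, 20)],
         [F(0), F(-25, 20), F(-5, 20)],
         [F(0), F(1, 20), F(1, 20)]]


def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def mat_vec(X, v):
    """Product of a matrix with a column vector stored as a list."""
    return [sum(X[i][k] * v[k] for k in range(len(v))) for i in range(len(X))]


Lambda = mat_mul(mat_mul(M_inv, A), M)            # expected diag(0, -1, -5)
B_tilde = mat_vec(M_inv, B)                       # M^{-1} B
C_tilde = [sum(C[k] * M[k][j] for k in range(3)) for j in range(3)]   # C M
z0 = mat_vec(M_inv, [1, 1, 0])                    # M^{-1} x(0)


def y(t):
    """Step response obtained at the end of the worked solution."""
    return t + 8 / 5 - math.exp(-t) / 2 - math.exp(-5 * t) / 10


print(Lambda, B_tilde, C_tilde, z0, y(0.0))
```

Since y = x_1, the value y(0) should reproduce the initial condition x_1(0) = 1.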


If we drop the assumption that the eigenvalues of A are distinct then $\tilde{A} = M^{-1}AM$ is
no longer diagonal, but may be represented by the corresponding Jordan canonical form
J with M being made up of both eigenvectors and generalized eigenvectors of A. The
equivalent canonical form in this case will be

\[ \dot{\mathbf{z}} = J\mathbf{z} + M^{-1}B\mathbf{u} \]
\[ \mathbf{y} = CM\mathbf{z} + D\mathbf{u} \]

with the solution corresponding to (1.100) being

\[ \mathbf{x}(t) = Me^{J(t-t_0)}M^{-1}\mathbf{x}(t_0) + \int_{t_0}^{t} Me^{J(t-\tau)}M^{-1}B\mathbf{u}(\tau)\,d\tau \]


62 


63 


64 


65 


66 


1.10.7 Exercises 


Obtain in spectral form the response ofthe unforced 
second-order system 


: Е 
X(t) = X(t) £ 


x(t) Po 


4 


x(t), 


Ni 
Nin BIW 


67 
Using the eigenvectors as the frame of reference, 
sketch the trajectory. 
Using the spectral form of the solution given in 
(1.94), solve the second-order system 
— 2 2 
a= |T? Plx, х0) = 
2 -5 3 
and sketch the trajectory. 
68 


Repeat Exercise 62 for the system 


х= A P х(0) = А 
2 4 2 


Determine the equivalent canonical representation 
of the third-order system 


1 1 -2 -] 
X--1 2 ljx+] liu 

0 1 -1 -1 
y=[-2 1 0]х 


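The solution of a third-order linear system is
given by

\[ \mathbf{x} = \alpha_1e^{-t}\mathbf{e}_1 + \alpha_2e^{-2t}\mathbf{e}_2 + \alpha_3e^{-3t}\mathbf{e}_3 \]

where $\mathbf{e}_1$, $\mathbf{e}_2$ and $\mathbf{e}_3$ are linearly independent vectors
having values

\[ \mathbf{e}_1 = [1 \;\; 1 \;\; 0]^{\mathrm{T}}, \quad \mathbf{e}_2 = [0 \;\; 1 \;\; 1]^{\mathrm{T}}, \quad \mathbf{e}_3 = [1 \;\; 2 \;\; 3]^{\mathrm{T}} \]

Initially, at time t = 0 the system state is
$\mathbf{x}(0) = [1 \;\; 1 \;\; 1]^{\mathrm{T}}$. Find $\alpha_1$, $\alpha_2$ and $\alpha_3$ using
the reciprocal basis method.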

Obtain the eigenvalues and eigenvectors of the matrix 


A- 5 4 
1 2 
Using a suitable transformation x(t) = Mz(t), reduce 
X(t) = Ax(f) to the canonical form Z(t) = Az(®, 
where A is the spectral matrix of A. Solve the 


decoupled canonical form for z, and hence solve 
for x(f) given ћаї х(0) = [1 4]. 


A second-order system is governed by the state 
equation 


we); to]? |o (t2 0) 
2.1 11 


Using a suitable transformation x(r) 2 Mz(f), reduce 
this to the canonical form 


0) = А) + Bufi) 


where A is the spectral matrix of 


В 


and B is a suitable 2 x 2 matrix. 

For the input u(#)=[4 3]" solve the decoupled 
canonical form for z, and hence solve for x(t) given 
that x(0) 2 [1 2]. Compare the answer with that 
for Exercise 60. 


In Chapter 5 we shall consider the solution of state-space models using the Laplace 
transform method and in Chapter 6 extend the analysis to discrete-time systems using 


z-transforms. 
1.11 Engineering application: Lyapunov stability analysis


The Russian mathematician Aleksandr Mikhailovich Lyapunov (1857-1918) developed
an approach to stability analysis which is now referred to as the direct (or second)
method of Lyapunov. His approach remained almost unknown in the English-speaking 
world for around half a century, before it was translated into English in the late 1950s. 
Publication of Lyapunov's work in English aroused great interest, and it is now widely 
used for stability analysis of linear and non-linear systems, both time-invariant and 
time-varying. Also, the approach has proved to be a useful tool in system design such 
as, for example, in the design of stable adaptive control systems. The Lyapunov method 
is in fact a 'method of approach' rather than a systematic means of investigating stability
and much depends on the ingenuity of the user in obtaining suitable Lyapunov func- 
tions. There is no unique Lyapunov function for a given system. 

In this section we briefly introduce the Lyapunov approach and will restrict con- 
sideration to the unforced (absence of any input) linear time-invariant system 


\[ \dot{\mathbf{x}} = A\mathbf{x} \tag{1.103} \]

where $\mathbf{x} = [x_1, x_2, \dots, x_n]^{\mathrm{T}}$ is the n-state vector and A is a constant n × n matrix. For the
linear system (1.103) the origin $\mathbf{x} = \mathbf{0}$ is the only point of equilibrium. If, for any initial state
$\mathbf{x}(0)$, the trajectory (solution path) $\mathbf{x}(t)$ of the system approaches zero (the equilibrium point)
as $t \to \infty$ then the system is said to be asymptotically stable. In practice the elements
of the matrix A may include system parameters and we are interested in determining
what constraints, if any, must be placed on these parameters to ensure system stability.
Stability of (1.103) is further discussed in Section 5.7.1, where algebraic criteria for
stability are presented. In particular, it is shown that stability of system (1.103) is
ensured if and only if all the eigenvalues of the state matrix A have negative real parts.

To develop the Lyapunov approach we set up a nest of closed surfaces, around the 
origin (equilibrium point), defined by the scalar function 


\[ V(\mathbf{x}) = V(x_1, x_2, \dots, x_n) = C \tag{1.104} \]


where C is a positive constant (the various surfaces are obtained by increasing the 
values of C as we move away from the origin). If the function V(x) satisfies the follow- 
ing conditions: 


(a) V(x) = 0 at the origin, that is V(0) = 0; 
(b) V(x) > 0 away from the origin; 
(c) V(x) is continuous with continuous partial derivatives; 


then it is called a scalar Lyapunov function. (Note that conditions (a) and (b) together
ensure that $V(\mathbf{x})$ is a positive definite function.) We now consider the rate of change
of $V(\mathbf{x})$, called the Eulerian derivative of $V(\mathbf{x})$ and denoted by $\dot{V}(\mathbf{x})$, along the trajectory
of the system under investigation; that is,

\[ \dot{V}(\mathbf{x}) = \frac{dV}{dt} = \frac{\partial V}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial V}{\partial x_2}\frac{dx_2}{dt} + \dots + \frac{\partial V}{\partial x_n}\frac{dx_n}{dt} \tag{1.105} \]

where the values of $\dot{x}_1, \dot{x}_2, \dots, \dot{x}_n$ are substituted from the given equations representing
the system ((1.103) in the case of the linear equations under consideration).

If $\dot{V}$ satisfies the condition

(d) $\dot{V}(\mathbf{x})$ is negative definite

then it follows that all the trajectories cross the surfaces $V(\mathbf{x}) = C$ in an inward direction
and must tend to the origin, the position of equilibrium. Thus asymptotic stability has
been assured without having to solve the differential equations representing the system.
The function $V(\mathbf{x})$ which satisfies conditions (a)-(d) is called a Lyapunov function for
the system being considered.

If we start with a positive-definite V(x) and impose conditions on $\dot{V}(x)$ to be
negative-definite, then these conditions will provide sufficient but not necessary stability criteria,
and in many cases they may be unduly restrictive. However, if we are able to start with
a negative-definite $\dot{V}(x)$ and work back to impose conditions on V(x) to be
positive-definite, then these conditions provide necessary and sufficient stability criteria.
This second procedure is far more difficult to apply than the first, although it may be
applied in certain cases, and in particular to linear systems.

Of particular importance as Lyapunov functions for linear systems are quadratic
forms in the variables $x_1, x_2, \ldots, x_n$, which were introduced in Section 1.6.4. These
may be written in the matrix form $V(x) = x^T P x$, where P is a real symmetric matrix.
Necessary and sufficient conditions for V(x) to be positive-definite are provided by
Sylvester's criterion, which states that all the leading principal minors of P of order 1, 2, ..., n
must be positive; that is

$$ p_{11} > 0, \quad \begin{vmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{vmatrix} > 0, \quad \begin{vmatrix} p_{11} & p_{12} & p_{13} \\ p_{12} & p_{22} & p_{23} \\ p_{13} & p_{23} & p_{33} \end{vmatrix} > 0, \quad \ldots, \quad |P| > 0 $$








Returning to the linear system (1.103) let us consider as a tentative Lyapunov function
the quadratic form

$$ V(x) = x^T P x $$

where P is an n × n real symmetric matrix. To obtain the Eulerian derivative of V(x)
with respect to system (1.103) we first differentiate V(x) with respect to t

$$ \frac{dV}{dt} = \dot{x}^T P x + x^T P \dot{x} $$

and then substitute for $\dot{x}^T$ and $\dot{x}$ from (1.103) giving

$$ \dot{V}(x) = (Ax)^T P x + x^T P (Ax) $$

that is

$$ \dot{V}(x) = x^T (A^T P + PA) x $$

or alternatively

$$ \dot{V}(x) = -x^T Q x \qquad (1.106) $$

where

$$ -Q = A^T P + PA \qquad (1.107) $$


To obtain necessary and sufficient conditions for the stability of the linear system
(1.103) we start with any negative definite quadratic form $-x^T Q x$, with an n × n
symmetric matrix Q, and solve matrix equation (1.107) for the elements of P. The
conditions imposed on P to ensure that it is positive definite then provide the required
necessary and sufficient stability criteria.


Example 1.46

The vector-matrix differential equation model representing an unforced linear R-C
circuit is

$$ \dot{x} = \begin{bmatrix} -4a & 4a \\ 2a & -6a \end{bmatrix} x \qquad \text{(i)} $$

Examine its stability using the Lyapunov approach.


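Solution

Take Q of equation (1.107) to be the identity matrix I, which is positive-definite
(thus −Q is negative-definite); then (1.107) may be written

$$ \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} -4a & 2a \\ 4a & -6a \end{bmatrix} \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix} + \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix} \begin{bmatrix} -4a & 4a \\ 2a & -6a \end{bmatrix} \qquad \text{(ii)} $$

Equating elements in (ii) gives

$$ -8a p_{11} + 4a p_{12} = -1, \quad 4a p_{11} - 10a p_{12} + 2a p_{22} = 0, \quad 8a p_{12} - 12a p_{22} = -1 $$

Solving for the elements gives

$$ p_{11} = \frac{7}{40a}, \quad p_{12} = \frac{1}{10a}, \quad p_{22} = \frac{3}{20a} $$

so that

$$ P = \frac{1}{40a} \begin{bmatrix} 7 & 4 \\ 4 & 6 \end{bmatrix} $$

The leading principal minors of $\begin{bmatrix} 7 & 4 \\ 4 & 6 \end{bmatrix}$ are $|7| > 0$ and $\begin{vmatrix} 7 & 4 \\ 4 & 6 \end{vmatrix} = 26 > 0$.

Thus, by Sylvester's criterion, P is positive-definite and the system is asymptotically
stable provided a > 0.

Note that the Lyapunov function in this case was

$$ V(x) = x^T P x = \frac{1}{40a}(7x_1^2 + 8x_1 x_2 + 6x_2^2) $$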
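The calculation in this example can be checked numerically. The following Python sketch is an illustration only, not part of the original text: the helper names `solve_linear` and `lyapunov_2x2` are hypothetical, and the value a = 1 is an assumed sample value of the circuit parameter. It writes the three independent elements of $A^T P + PA = -Q$ as a linear system in $p_{11}$, $p_{12}$, $p_{22}$, solves it in exact rational arithmetic, and then applies Sylvester's criterion.

```python
from fractions import Fraction


def solve_linear(M, rhs):
    """Solve M y = rhs exactly by Gauss-Jordan elimination on Fractions."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def lyapunov_2x2(A, Q):
    """Return symmetric P = [[p11, p12], [p12, p22]] with A^T P + P A = -Q."""
    (a11, a12), (a21, a22) = A
    # Rows are the (1,1), (1,2) and (2,2) elements of A^T P + P A,
    # expressed as linear combinations of the unknowns p11, p12, p22.
    M = [
        [2 * a11, 2 * a21, 0],
        [a12, a11 + a22, a21],
        [0, 2 * a12, 2 * a22],
    ]
    rhs = [-Q[0][0], -Q[0][1], -Q[1][1]]
    p11, p12, p22 = solve_linear(M, rhs)
    return [[p11, p12], [p12, p22]]


a = Fraction(1)  # assumed sample value; the text shows P scales as 1/a
A = [[-4 * a, 4 * a], [2 * a, -6 * a]]
Q = [[1, 0], [0, 1]]  # identity matrix, as chosen in the solution
P = lyapunov_2x2(A, Q)

# Sylvester's criterion: both leading principal minors must be positive.
minor1 = P[0][0]
minor2 = P[0][0] * P[1][1] - P[0][1] ** 2
is_positive_definite = minor1 > 0 and minor2 > 0
print(P, is_positive_definite)
```

Exact fractions are used here only so that the result can be compared directly with the hand calculation; for larger systems a numerical linear-algebra library would be the more usual choice.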


1.11.1 Exercises 


Using the Lyapunov approach investigate the 
stability of the system described by the state 
equation 


же p D 72 
3 = 


Take Q to be the unit matrix. Confirm your answer 
by determining the eigenvalues of the state matrix. 


Repeat Exercise 68 for the system described by the 
state equation 


. [3 2 
х= х 
ed = 
For the system modelled by the state equation 


ЖЕ 


use the Lyapunov approach to determine the 
constraints on the parameters a and b that yield 
necessary and sufficient conditions for asymptotic 
stability. 


Condition (d) in the formulation of a Lyapunov
function, requiring $\dot{V}(x)$ to be negative-definite, may
be relaxed to $\dot{V}(x)$ being negative-semidefinite
provided $\dot{V}(x)$ is not identically zero along any
trajectory. A third-order system, in the absence of
an input, is modelled by the state equation

$$ \dot{x} = Ax $$

where $x = [x_1 \ x_2 \ x_3]^T$ and

$$ A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & -2 & 1 \\ -k & 0 & -1 \end{bmatrix} $$

with k being a constant scalar. It is required to use the Lyapunov approach to
determine the constraints on k to ensure asymptotic
stability.

(a) In (1.106) choose Q to be the positive-semidefinite matrix

$$ Q = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$

so that

$$ \dot{V}(x) = -x^T Q x = -x_3^2 $$

Verify that $\dot{V}(x)$ is identically zero only at the
origin (equilibrium point) and is therefore not
identically zero along any trajectory.

(b) Using this matrix Q solve the matrix equation

$$ A^T P + PA = -Q $$

to determine the matrix P.

(c) Using Sylvester's criterion show that the
system is asymptotically stable for
0 < k < 6.

A feedback control system modelled by the
differential equation

$$ \ddot{x} + a\dot{x} + kx = 0 $$

is known to be asymptotically stable, for k > 0,
a > 0. Set up the state-space form of the equation
and show that

$$ V(x_1, x_2) = kx_1^2 + (x_2 + ax_1)^2, \quad x_1 = x, \quad x_2 = \dot{x} $$

is a suitable Lyapunov function for verifying
this.

1.12 Engineering application: capacitor microphone

Many smaller portable tape recorders have a capacitor microphone built in, since such
a system is simple and robust. It works on the principle that if the distance between the
plates of a capacitor changes then the capacitance changes in a known manner, and
these changes induce a current in an electric circuit. This current can then be amplified
or stored. The basic system is illustrated in Figure 1.10. There is a small air gap (about
0.02 mm) between the moving diaphragm and the fixed plate. Sound waves falling on
the diaphragm cause vibrations and small variations in the capacitance C; these are
certainly sufficiently small that the equations can be linearized.

Figure 1.10 Capacitor microphone. (Labelled parts: capacitor, air gap, moving diaphragm, insulation, metal frame, fixed plate.)

We assume that the diaphragm has mass m and moves as a single unit so that its
motion is one-dimensional. The housing of the diaphragm is modelled as a
spring-and-dashpot system. The plates are connected through a simple circuit containing a
resistance and an imposed steady voltage from a battery. Figure 1.11 illustrates the
model. The distance x(t) is measured from the position of zero spring tension, F is the
imposed force and f is the force required to hold the moving plate in position against
the electrical attraction. The mechanical motion is governed by Newton's equation

$$ m\ddot{x} = -kx - \lambda\dot{x} - f + F \qquad (1.108) $$

and the electrical circuit equation gives

$$ E = RI + \frac{q}{C}, \quad \text{with} \quad \frac{dq}{dt} = I \qquad (1.109) $$

The variation of capacitance C with x is given by the standard formula

$$ C = \frac{C_0 a}{a + x} $$

where a is the equilibrium distance between the plates. The force f is not so obvious,
but the following assumption is standard

$$ f = \frac{1}{2} q^2 \frac{d}{dx}\left(\frac{1}{C}\right) = \frac{q^2}{2C_0 a} $$

It is convenient to write the equations in the first-order form

$$ \dot{x} = v $$
$$ m\dot{v} = -kx - \lambda v - \frac{q^2}{2C_0 a} + F(t) $$
$$ R\dot{q} = -\frac{q(a + x)}{aC_0} + E $$

Figure 1.11 Capacitor microphone model. (Labels: fixed plate, moving diaphragm, displacement x(t) from the zero-spring-tension position, spring k, forces f and F(t), resistance R, current I = dq/dt.)





Furthermore, it is convenient to non-dimensionalize the equations. While it is obvious
how to do this for the distance and velocity, for the time and the charge it is less so.
There are three natural time scales in the problem: the electrical time $\tau_1 = RC_0$, the
spring time $\tau_2^2 = m/k$ and the damping time $\tau_3 = m/\lambda$. Choosing to non-dimensionalize
the time with respect to $\tau_1$, the non-dimensionalization of the charge follows:

$$ \tau = \frac{t}{\tau_1}, \quad X = \frac{x}{a}, \quad V = \frac{v}{ka/\lambda}, \quad Q = \frac{q}{\sqrt{2C_0 ka^2}} $$

Then, denoting differentiation with respect to $\tau$ by a prime, the equations are

$$ X' = \frac{RC_0 k}{\lambda} V $$
$$ \frac{m}{\lambda RC_0} V' = -X - V - Q^2 + \frac{F}{ka} $$
$$ Q' = -Q(1 + X) + \frac{EC_0}{\sqrt{2C_0 ka^2}} $$

There are four non-dimensional parameters: the external force divided by the spring
force gives the first, $G = F/ka$; the electrical force divided by the spring force gives the
second, $D^2 = (E^2 C_0/2a)/ka$; and the remaining two are

$$ A = \frac{RC_0 k}{\lambda} = \frac{\tau_1 \tau_3}{\tau_2^2}, \quad B = \frac{m}{\lambda RC_0} = \frac{\tau_3}{\tau_1} $$

The final equations are therefore

$$ X' = AV $$
$$ BV' = -X - V - Q^2 + G \qquad (1.110) $$
$$ Q' = -Q(1 + X) + D $$

In equilibrium, with no driving force, G = 0 and V = X' = V' = Q' = 0, so that

$$ Q^2 + X = 0 $$
$$ Q(1 + X) - D = 0 \qquad (1.111) $$

or, on eliminating Q,

$$ X(1 + X)^2 = -D^2 $$

From Figure 1.12, we see that there is always one solution for X < −1, or equivalently
x < −a. The implication of this solution is that the plates have crossed. This is clearly
impossible, so the solution is discarded on physical grounds. There are two other
solutions if

$$ D^2 < \frac{4}{27} $$

or

$$ \frac{E^2 C_0}{2ka^2} < \frac{4}{27} \qquad (1.112) $$

Figure 1.12 Solutions to equations (1.111).

We can interpret this statement as saying that the electrical force must not be too strong,
and (1.112) gives a precise meaning to what 'too strong' means. There are two
physically satisfactory equilibrium solutions $-\frac{1}{3} < X_1 < 0$ and $-1 < X_2 < -\frac{1}{3}$, and the
only question left is whether they are stable or unstable.
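As an illustration of the equilibrium condition, the two physically admissible roots of $X(1 + X)^2 = -D^2$ can be located numerically. The sketch below is not from the original text: it uses simple bisection, the helper names `g`, `bisect` and `equilibria` are hypothetical, and the value $D^2 = 0.1$ is an assumed sample value satisfying $D^2 < 4/27$. The brackets come from the sign pattern of $X(1+X)^2 + D^2$, which is positive at X = −1 and X = 0 and negative at X = −1/3 whenever $D^2 < 4/27$.

```python
def g(X, D2):
    """Residual of the equilibrium condition X(1 + X)^2 + D^2 = 0."""
    return X * (1.0 + X) ** 2 + D2


def bisect(func, lo, hi, tol=1e-12):
    """Locate a root of func in [lo, hi] by repeated interval halving."""
    flo = func(lo)
    if flo * func(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = func(mid)
        if flo * fmid <= 0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)


def equilibria(D2):
    """Return (X1, X2) with -1/3 < X1 < 0 and -1 < X2 < -1/3."""
    if not 0.0 < D2 < 4.0 / 27.0:
        raise ValueError("two admissible equilibria exist only for 0 < D^2 < 4/27")
    X1 = bisect(lambda X: g(X, D2), -1.0 / 3.0, 0.0)
    X2 = bisect(lambda X: g(X, D2), -1.0, -1.0 / 3.0)
    return X1, X2


D2 = 0.1  # assumed sample value of D^2
X1, X2 = equilibria(D2)
print(X1, X2)
```

The corresponding equilibrium charges follow from (1.111) as $Q_i = \sqrt{-X_i}$, taking the positive root when D > 0.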

Stability is determined by small oscillations about the two values $X_1$ and $X_2$, where
these values satisfy (1.111). Writing

$$ X = X_i + \varepsilon, \quad Q = Q_i + \eta, \quad V = \theta $$

and substituting into (1.110), neglecting terms in $\varepsilon^2$, $\eta^2$, $\theta^2$, $\varepsilon\theta$ and so on, gives

$$ \varepsilon' = A\theta $$
$$ B\theta' = -\varepsilon - \theta - 2Q_i\eta + G \qquad (1.113) $$
$$ \eta' = -Q_i\varepsilon - (1 + X_i)\eta $$

Equations (1.113) are the linearized versions of (1.110) about the equilibrium values.
To test for stability, we put G = 0 and $\varepsilon = L e^{\alpha\tau}$, $\theta = M e^{\alpha\tau}$, $\eta = N e^{\alpha\tau}$ into (1.113):

$$ L\alpha = AM $$
$$ BM\alpha = -L - M - 2Q_i N $$
$$ N\alpha = -Q_i L - (1 + X_i)N $$

which can be written in the matrix form

$$ \alpha \begin{bmatrix} L \\ M \\ N \end{bmatrix} = \begin{bmatrix} 0 & A & 0 \\ -1/B & -1/B & -2Q_i/B \\ -Q_i & 0 & -(1 + X_i) \end{bmatrix} \begin{bmatrix} L \\ M \\ N \end{bmatrix} $$

Thus the fundamental stability problem is an eigenvalue problem, a result common
to all vibrational stability problems. The equations have non-trivial solutions if

$$ 0 = \begin{vmatrix} -\alpha & A & 0 \\ -1/B & -(1/B) - \alpha & -2Q_i/B \\ -Q_i & 0 & -(1 + X_i) - \alpha \end{vmatrix} $$

$$ = -\left[ B\alpha^3 + (B(1 + X_i) + 1)\alpha^2 + (1 + X_i + A)\alpha + A(1 + X_i - 2Q_i^2) \right] / B $$

For stability, $\alpha$ must have a negative real part, so that the vibrations damp out, and the
Routh-Hurwitz criterion (Section 5.6.2) gives the conditions for this to be the case.
Each of the coefficients must be positive, and for the first three

$$ B > 0, \quad B(1 + X_i) + 1 > 0, \quad 1 + X_i + A > 0 $$

are obviously satisfied since $-1 < X_i < 0$. The next condition is

$$ A(1 + X_i - 2Q_i^2) > 0 $$

which, from (1.111), gives

$$ 1 + 3X_i > 0, \quad \text{or} \quad X_i > -\tfrac{1}{3} $$

Thus the only solution that can possibly be stable is the one for which $X_i > -\frac{1}{3}$; the other
solution is unstable. There is one final condition to check,

$$ [B(1 + X_i) + 1](1 + X_i + A) - BA(1 + X_i - 2Q_i^2) > 0 $$

or

$$ B(1 + X_i)^2 + 1 + X_i + A + 2BAQ_i^2 > 0 $$

Since all the terms are positive, the solution $X_i > -\frac{1}{3}$ is indeed a stable solution.
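The Routh-Hurwitz test used above is easy to apply mechanically. The following Python sketch is illustrative only: the function names `routh_hurwitz_cubic` and `is_stable` are hypothetical, and the values A = B = 1 and the two trial equilibrium positions are assumed sample values, each taken as an equilibrium for its own $D^2 = -X_i(1 + X_i)^2$. It forms the coefficients of the cubic in $\alpha$, using $Q_i^2 = -X_i$ from (1.111), and checks that all coefficients are positive together with the final product condition.

```python
def routh_hurwitz_cubic(a3, a2, a1, a0):
    """True when every root of a3 s^3 + a2 s^2 + a1 s + a0 has negative real part."""
    all_positive = a3 > 0 and a2 > 0 and a1 > 0 and a0 > 0
    return all_positive and a2 * a1 > a3 * a0


def is_stable(X_i, A, B):
    """Apply the criterion to the linearized microphone equations at X_i."""
    Q2 = -X_i  # Q_i^2 = -X_i from the equilibrium condition (1.111)
    a3 = B
    a2 = B * (1.0 + X_i) + 1.0
    a1 = 1.0 + X_i + A
    a0 = A * (1.0 + X_i - 2.0 * Q2)
    return routh_hurwitz_cubic(a3, a2, a1, a0)


A_param, B_param = 1.0, 1.0  # assumed sample values of the parameters A and B
stable_upper = is_stable(-0.2, A_param, B_param)  # X_i > -1/3
stable_lower = is_stable(-0.5, A_param, B_param)  # X_i < -1/3
print(stable_upper, stable_lower)
```

With these sample values the constant coefficient $A(1 + 3X_i)$ changes sign at $X_i = -1/3$, which is what separates the two cases.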

Having established the stability of one of the positions of the capacitor diaphragm,
the next step is to look at the response of the microphone to various inputs. The
characteristics can most easily be checked by looking at the frequency response, which is
the system response to an individual input $G = b\,e^{j\omega\tau}$, as the frequency $\omega$ varies. This
will give information of how the electrical output behaves and for which range of
frequencies the response is reasonably flat.

The essential point of this example is to show that a practical vibrational problem
gives a stability problem that involves eigenvalues and a response that involves a
matrix inversion. The same behaviour is observed for more complicated vibrational
problems.

1.13 Review exercises (1–20)

Check your answers using MATLAB or MAPLE whenever possible.

1  Obtain the eigenvalues and corresponding
eigenvectors of the matrices


(a) $\begin{bmatrix} -1 & 6 & 12 \\ 0 & -13 & 30 \\ 0 & -9 & 20 \end{bmatrix}$
oT 
Ет 
I EO 
E g 
ОЕК) 
[0-1 1 





2  Find the principal stress values (eigenvalues)
and the corresponding principal stress directions 
(eigenvectors) for the stress matrix 


3 
Im 
1 


Re WwW N 
is j 


Verify that the principal stress directions are 
mutually orthogonal. 


3  Find the values of b and c for which the matrix

$$ A = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 3 & b \\ 0 & b & c \end{bmatrix} $$

has $[1 \ 0 \ 1]^T$ as an eigenvector. For these
values of b and c calculate all the eigenvalues
and corresponding eigenvectors of the matrix A.


4  Use Gerschgorin's theorem to show that the
largest-modulus eigenvalue $\lambda_1$ of the matrix

$$ A = \begin{bmatrix} 4 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 4 \end{bmatrix} $$

is such that $2 \le |\lambda_1| \le 6$.

Use the power method, with starting vector
$x^{(0)} = [-1 \ 1 \ -1]^T$, to find $\lambda_1$ correct to one
decimal place.


5  (a) Using the power method find the dominant
eigenvalue and the corresponding eigenvector
of the matrix

$$ A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2.5 & 1 \\ 1 & 1 & 3 \end{bmatrix} $$

starting with an initial vector $[1 \ 1 \ 1]^T$
and working to three decimal places.

(b) Given that another eigenvalue of A is 1.19
correct to two decimal places, find the value of the
third eigenvalue using a property of matrices.

(c) Having determined all the eigenvalues of A,
indicate which of these can be obtained by
using the power method on the following
matrices: (i) $A^{-1}$; (ii) $A - 3I$.


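6  Consider the differential equations

$$ \frac{dx}{dt} = 4x + y + z $$
$$ \frac{dy}{dt} = 2x + 5y + 4z $$
$$ \frac{dz}{dt} = -x - y $$

Show that if it is assumed that there are solutions
of the form $x = \alpha e^{\lambda t}$, $y = \beta e^{\lambda t}$ and $z = \gamma e^{\lambda t}$ then
the system of equations can be transformed into
the eigenvalue problem

$$ \begin{bmatrix} 4 & 1 & 1 \\ 2 & 5 & 4 \\ -1 & -1 & 0 \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix} = \lambda \begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix} $$

Show that the eigenvalues for this problem
are 5, 3 and 1, and find the eigenvectors
corresponding to the smallest eigenvalue.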


7  Find the eigenvalues and corresponding
eigenvectors for the matrix 


EC MED 
АЕ ИЕ 
M 


Write down the modal matrix M and spectral
matrix $\Lambda$ of A, and confirm that

$$ M^{-1} A M = \Lambda $$

8  Show that the eigenvalues of the symmetric matrix

$$ A = \begin{bmatrix} 1 & 0 & -4 \\ 0 & 5 & 4 \\ -4 & 4 & 3 \end{bmatrix} $$

are 9, 3 and −3. Obtain the corresponding
eigenvectors in normalized form, and write down
the normalized modal matrix $\hat{M}$. Confirm that

$$ \hat{M}^T A \hat{M} = \Lambda $$

where $\Lambda$ is the spectral matrix of A.


9  In a radioactive series consisting of four different
nuclides starting with the parent substance $N_1$ and
ending with the stable product $N_4$ the amounts of
each nuclide present at time t are given by the
differential equations model

$$ \frac{dN_1}{dt} = -6N_1 $$
$$ \frac{dN_2}{dt} = 6N_1 - 4N_2 $$
$$ \frac{dN_3}{dt} = 4N_2 - 2N_3 $$
$$ \frac{dN_4}{dt} = 2N_3 $$

Express these in the vector-matrix form

$$ \dot{N} = AN $$

where $N = [N_1 \ N_2 \ N_3 \ N_4]^T$. Find the eigenvalues
and corresponding eigenvectors of A. Using the
spectral form of the solution, determine $N_4(t)$ given
that at time t = 0, $N_1 = C$ and $N_2 = N_3 = N_4 = 0$.


10  (a) Given
2 
A= о 
к 
use the Cayley-Hamilton theorem to find 


(i) A'-3A5- A* - 3A? - 2A? «3l 
(ii) $A^k$, where k > 0 is an integer.

(b) Using the Cayley–Hamilton theorem, find
$e^{At}$ when


Ji 


11  Show that the matrix


3 


A- 4 


M C E 
D e 


has an eigenvalue $\lambda = 1$ with algebraic
multiplicity 3. By considering the rank of a
suitable matrix, show that there is only one
corresponding linearly independent eigenvector
$e_1$. Obtain the eigenvector $e_1$ and two further
generalized eigenvectors. Write down the
corresponding modal matrix M and confirm that
$M^{-1}AM = J$, where J is the appropriate Jordan
matrix. (Hint: In this example care must be taken
in applying the procedure to evaluate the
generalized eigenvectors to ensure that the
triad of vectors takes the form $(T^2\omega, \ T\omega, \ \omega)$,
where $T = A - \lambda I$, with $T^2\omega = e_1$.)


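12  The equations of motion of three equal masses
connected by springs of equal stiffness are

$$ \ddot{x} = -2x + y $$
$$ \ddot{y} = x - 2y + z $$
$$ \ddot{z} = y - 2z $$

Show that for normal modes of oscillation

$$ x = X\cos\omega t, \quad y = Y\cos\omega t, \quad z = Z\cos\omega t $$

to exist then the condition on $\lambda = \omega^2$ is

$$ \begin{vmatrix} \lambda - 2 & 1 & 0 \\ 1 & \lambda - 2 & 1 \\ 0 & 1 & \lambda - 2 \end{vmatrix} = 0 $$

Find the three values of $\lambda$ that satisfy this
condition, and find the ratios X : Y : Z in
each case.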

13  Classify the following quadratic forms:

(a) $2x^2 + y^2 + 2z^2 - 2xy - 2yz$

(b) $3x^2 + 7y^2 + 2z^2 - 4xy - 4xz$

(c) $16x^2 + 36y^2 + 17z^2 + 32xy + 32xz + 16yz$

(d) $-21x^2 + 30xy - 12xz - 11y^2 + 8yz - 2z^2$

(e) $-x^2 - 3y^2 - 5z^2 + 2xy + 2xz + 2yz$


14  Show that $e_1 = [1 \ 2 \ 3]^T$ is an eigenvector of
the matrix

$$ \begin{bmatrix} \frac{7}{2} & -\frac{1}{2} & -\frac{1}{2} \\ 4 & -1 & 0 \\ -\frac{3}{2} & \frac{3}{2} & \frac{1}{2} \end{bmatrix} $$

and find its corresponding eigenvalue. Find the
other two eigenvalues and their corresponding
eigenvectors.
Write down in spectral form the general
solution of the system of differential
equations

$$ 2\frac{dx}{dt} = 7x - y - z $$
$$ \frac{dy}{dt} = 4x - y $$
$$ 2\frac{dz}{dt} = -3x + 3y + z $$

Hence show that if x = 2, y = 4 and z = 6
when t = 0 then the solution is

$$ x = 2e^t, \quad y = 4e^t, \quad z = 6e^t $$


15  (a) Find the SVD form of the matrix

$$ A = \begin{bmatrix} 1.2 & 0.9 & -4 \\ 1.6 & 1.2 & 3 \end{bmatrix} $$

(b) Use the SVD to determine the pseudo inverse
$A^\dagger$ and confirm it is a right inverse of A.

(c) Determine the pseudo inverse $A^\dagger$ without using
the SVD.


16  From (1.51) the unitary matrices $\hat{U}$ and $\hat{V}$ and sigma
matrix $\Sigma$ may be written in the partitioned form:


where S is r x r diagonal matrix having the singular 
values of A as its diagonal elements and 0 denotes 
zero matrices having appropriate order. 


(a) Show that the SVD form of A may be 
expressed in the form 


PES US 


This is called the reduced singular value 
decomposition of A. 


(b) Deduce that the pseudo inverse is given by 
А! = #507 


(c) Use the results of (a) and (b) to determine
the SVD form and pseudo inverse of the
matrix

$$ A = \begin{bmatrix} 1 & -1 \\ -2 & 2 \\ 2 & -2 \end{bmatrix} $$

and check your answers with those obtained
in Exercise 46.


17  A linear time-invariant system (A, b, c) is
modelled by the state-space equations

$$ \dot{x}(t) = Ax(t) + bu(t) $$
$$ y(t) = c^T x(t) $$

where x(t) is the n-dimensional state vector, and
u(t) and y(t) are the system input and output
respectively. Given that the system matrix A
has n distinct non-zero eigenvalues, show that
the system equations may be reduced to the
canonical form

$$ \dot{\xi}(t) = \Lambda\xi(t) + b_1 u(t) $$
$$ y(t) = c_1^T \xi(t) $$

where $\Lambda$ is a diagonal matrix. What properties of
this canonical form determine the controllability
and observability of (A, b, c)?


Reduce to canonical form the system (A, b, c) 
having 


1 1 -2 
A- -1 2 | 
О 1 =! 
-1 —2 
b=] 1 c= 
=1 0 


and comment on its stability, controllability and
observability by considering the ranks of the
appropriate Kalman matrices $[b \ \ Ab \ \ A^2 b]$
and $[c \ \ A^T c \ \ (A^T)^2 c]$.


18  A third-order system is modelled by the state-space
representation 


20 2D i 6 
х=| 0 0 1|х+|0 1|# 
0C i 


where $x = [x_1 \ x_2 \ x_3]^T$ and $u = [u_1 \ u_2]^T$.
Find the transformation x = Mz which reduces
the model to canonical form and solve for x(t)
given $x(0) = [10 \ 5 \ 2]^T$ and $u(t) = [t \ 1]^T$.


19  The behaviour of an unforced mechanical system
is governed by the differential equation

$$ \dot{x}(t) = \begin{bmatrix} 5 & 2 & -1 \\ 3 & 6 & -9 \\ 1 & 1 & 1 \end{bmatrix} x(t), \quad x(0) = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} $$

(a) Show that the eigenvalues of the system
matrix are 6, 3, 3 and that there is only
one linearly independent eigenvector
corresponding to the eigenvalue 3. Obtain the
eigenvectors corresponding to the eigenvalues
6 and 3 and a further generalized eigenvector
for the eigenvalue 3.

(b) Write down a generalized modal matrix M
and confirm that

$$ AM = MJ $$

for an appropriate Jordan matrix J.

(c) Using the result

$$ x(t) = M e^{Jt} M^{-1} x(0) $$

obtain the solution to the given differential
equation.


20  (Extended problem) Many vibrational systems are
modelled by the vector-matrix differential equation

$$ \ddot{x}(t) = Ax(t) \qquad (1) $$

where A is a constant n × n matrix and
$x(t) = [x_1(t) \ x_2(t) \ \ldots \ x_n(t)]^T$. By
substituting $x = e^{\lambda t}u$, show that

$$ \lambda^2 u = Au \qquad (2) $$

and that non-trivial solutions for u exist
provided that

$$ |A - \lambda^2 I| = 0 \qquad (3) $$

Let $\lambda_1^2, \lambda_2^2, \ldots, \lambda_n^2$ be the solutions of (3) and
$u_1, u_2, \ldots, u_n$ the corresponding solutions of (2).

Define M to be the matrix having $u_1, u_2, \ldots, u_n$
as its columns and S to be the diagonal matrix
having $\lambda_1^2, \lambda_2^2, \ldots, \lambda_n^2$ as its diagonal elements.
By applying the transformation x(t) = Mq(t),
where $q(t) = [q_1(t) \ q_2(t) \ \ldots \ q_n(t)]^T$, to (1),
show that

$$ \ddot{q} = Sq \qquad (4) $$

and deduce that (4) has solutions of the form

$$ q_i = c_i \sin(\omega_i t + \alpha_i) \qquad (5) $$

where $c_i$ and $\alpha_i$ are arbitrary constants and
$\lambda_i = j\omega_i$, with $j = \sqrt{-1}$.

The solutions $\lambda_i^2$ of (3) define the natural
frequencies $\omega_i$ of the system. The corresponding
solutions $q_i$ given in (5) are called the normal
modes of the system. The general solution of (1)
is then obtained using x(t) = Mq(t).

A mass-spring vibrating system is governed
by the differential equations

$$ \ddot{x}_1(t) = -3x_1(t) + 2x_2(t) $$
$$ \ddot{x}_2(t) = x_1(t) - 2x_2(t) $$

with $x_1(0) = 1$ and $x_2(0) = \dot{x}_1(0) = \dot{x}_2(0) = 2$.
Determine the natural frequencies and the
corresponding normal modes of the system.
Hence obtain the general displacement $x_1(t)$
and $x_2(t)$ at time t > 0. Plot graphs of both
the normal modes and the general solutions.

2.1 Introduction


Frequently the equations which express mathematical models in both engineering ana- 
lysis and engineering design involve derivatives and integrals of the models’ variables. 
Equations involving derivatives are called differential equations and those which include 
integrals or both integrals and derivatives are called integral equations or integro- 
differential equations. Generally integral and integro-differential equations are more 
difficult to deal with than purely differential ones. 

There are many methods and techniques for the analytical solution of elementary 
ordinary differential equations. The most common of these are covered in most first- 
level books on engineering mathematics (e.g. Modern Engineering Mathematics). 
However, many differential equations of interest to engineers are not amenable to ana- 
lytical solution and in these cases we must resort to numerical solutions. Numerical 
solutions have many disadvantages (it is, for instance, much less obvious how changes 
of parameters or coefficients in the equations affect the solutions) so an analytical solu- 
tion is generally more useful where one is available. 

There are many tools available to the engineer which will provide numerical solutions 
to differential equations. The most versatile of these perhaps are the major computer 
algebra systems such as MAPLE. These contain functions for both analytical and 

numerical solution of differential equations. Systems such as MATLAB/Simulink and 
Mathcad can also provide numerical solutions to differential equations problems. It 
may sometimes be necessary for the engineer to write a computer program to solve 
a differential equation numerically, either because suitable software packages are 
not available or because the packages available provide no method suitable for the 
particular differential equation under consideration. 

Whether the engineer uses a software package or writes a computer program for 
the specific problem, it is necessary to understand something of how numerical 
solution of differential equations is achieved mathematically. The engineer who 
does not have this understanding cannot critically evaluate the results provided by a 
software package and may fall into the trap of inadvertently using invalid results. In 
this chapter we develop the basics of the numerical solution of ordinary differential 
equations. 


2.2 Engineering application: motion in a viscous fluid


The problem of determining the motion of a body falling through a viscous fluid arises 
in a wide variety of engineering contexts. One obvious example is that of a parachutist, 
both in free fall and after opening his or her parachute. The dropping of supplies from 
aircraft provides another example. Many industrial processes involve adding particulate 
raw materials into process vessels containing fluids, whether gases or liquids, which 
exert viscous forces on the particles. Often the motion of the raw materials in the pro- 
cess vessel must be understood in order to ensure that the process is effective and 
efficient. Fluidized bed combustion furnaces involve effectively suspending particles 
in a moving gas stream through the viscous forces exerted by the gas on the particles. 
Thus, understanding the mechanics of the motion of a particle through a viscous fluid
has important engineering applications.

When a particle is falling through a viscous fluid it may be modelled simply in the
following way. The force of gravity acts downwards and is opposed by a viscous drag
force produced by the resistance of the fluid. Figure 2.1 shows a free body diagram of
the particle which is assumed to be falling vertically downwards. If the particle's mass
is m, the gravitational force is mg, and it is opposed by a drag force, D, acting to oppose
motion. The displacement of the particle from its initial position is x.

Figure 2.1 A particle falling through a viscous fluid. (Labels: v, mg.)

The equation of motion is

$$ m\frac{d^2x}{dt^2} = mg - D \qquad (2.1) $$

Before we can solve this equation, the form of the drag term must be determined.
For particles moving at a high speed it is often assumed that the drag is proportional to
the square of the speed. For slow motion the drag is sometimes assumed to be directly
proportional to the speed. In other applications it is more appropriate to assume that
drag is proportional to some power of the velocity, so that

$$ D = kv^{\alpha} = k\left(\frac{dx}{dt}\right)^{\alpha} \quad \text{where, normally, } 1 \le \alpha \le 2 $$

The differential equation (2.1) then becomes

$$ m\frac{d^2x}{dt^2} = mg - k\left(\frac{dx}{dt}\right)^{\alpha} $$

i.e.

$$ m\frac{d^2x}{dt^2} + k\left(\frac{dx}{dt}\right)^{\alpha} = mg \qquad (2.2) $$

This is a second-order, nonlinear, ordinary differential equation for x, the displacement of
the particle, as a function of time. In fact, for both $\alpha = 1$ and $\alpha = 2$, (2.2) can be solved
analytically, but for other values of $\alpha$ no such solution exists. If we want to solve the
differential equation for such values of $\alpha$ we must resort to numerical techniques.


2.3 Numerical solution of first-order ordinary
differential equations


In a book such as this we cannot hope to cover all of the many numerical techniques which 
have been developed for dealing with ordinary differential equations so we will concen- 
trate on presenting a selection of methods which illustrate the main strands of the theory. 
In so doing we will meet the main theoretical tools and unifying concepts of the area. 
In the last twenty years great advances have been made in the application of computers
to the solution of differential equations, particularly using computer algebra packages
to assist in the derivation of analytical solutions and the computation of numerical
solutions. The MATLAB package is principally oriented towards the solution of numerical
problems (although its Symbolic Math Toolbox and the MuPAD version are highly
capable) and contains a comprehensive selection of the best modern numerical techniques
giving the ability to solve most numerical problems in ordinary differential equations.
Indeed numerical solutions can be achieved both in native MATLAB and in the Simulink
simulation sub-system; which of these paths the user chooses to follow may well be
dictated as much by their experience and professional orientation as by theoretical
considerations. MAPLE, despite being mainly orientated towards the solution of
symbolic problems, also contains a comprehensive suite of numerical solution routines and
is, in practice, just as capable as MATLAB in this area. Moreover, MAPLE gives to the
user more control of the solution method used and includes a number of 'classical'
solution methods. These classical methods include all the methods which are used, in
this chapter, to introduce, develop and analyse the main strands of the theory mentioned
above. For this reason, MAPLE will be featured rather more frequently than MATLAB,
but the practising engineer is as likely to be using MATLAB for the numerical solution
of real-world problems as using MAPLE.

Despite the fact that professional engineers are very likely to be using these packages
to compute numerical solutions of ordinary differential equations it is still important
that they understand the methods which the computer packages use to do their work, for
otherwise they are at the mercy of the decisions made by the designers of the packages
who have no foreknowledge of the applications to which users may put the package. If
the engineering user does not have a sound understanding of the principles being used
within the package there is the ever present danger of using results outside their domain
of validity. From there it is a short step to engineering failures and human disasters.

2.3.1 A simple solution method: Euler's method

For a first-order differential equation dx/dt = f(t, x) we can define a direction field. The
direction field is that two-dimensional vector field in which the vector at any point (t, x)
has the gradient dx/dt. More precisely, it is the field

$$ \frac{[1, \ f(t, x)]}{\sqrt{1 + f^2(t, x)}} $$

For instance, Figure 2.2 shows the direction field of the differential equation
dx/dt = x(1 − x)t.

Figure 2.2 The direction field for the equation dx/dt = x(1 − x)t.
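To make the definition of the direction field concrete, the following short Python sketch evaluates the unit vector $[1, f(t, x)]/\sqrt{1 + f^2(t, x)}$ at a few grid points for f(t, x) = x(1 − x)t. It is an illustration only: the function names `f` and `direction` and the choice of grid points are assumptions made for this sketch, and a plotting package would normally be used to draw the resulting arrows.

```python
import math


def f(t, x):
    """Right-hand side of dx/dt = x(1 - x)t."""
    return x * (1.0 - x) * t


def direction(t, x):
    """Unit vector of the direction field at the point (t, x)."""
    slope = f(t, x)
    norm = math.sqrt(1.0 + slope * slope)
    return (1.0 / norm, slope / norm)


# Tabulate the field on a small, arbitrarily chosen grid.
for t in (0.0, 1.0, 2.0):
    for x in (-0.5, 0.5, 1.0, 1.5):
        dt, dx = direction(t, x)
        print(f"t={t:3.1f}  x={x:4.1f}  ->  ({dt:.3f}, {dx:.3f})")
```

Along the lines x = 0 and x = 1, and along t = 0, the slope is zero, so the field is horizontal there; x = 0 and x = 1 are themselves constant solutions of this equation.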

Since a solution of a differential equation is a function x(t) which has the property
dx/dt = f(t, x) at all points (t, x), the solutions of the differential equation are curves in
the (t, x) plane to which the direction field lines are tangential at every point. For
instance, the curves shown in Figure 2.3 are solutions of the differential equation
dx/dt = x(1 − x)t.

Figure 2.3 Solutions of dx/dt = x(1 − x)t superimposed on its direction field.

This immediately suggests that a curve representing a solution can be obtained by
sketching on the direction field a curve that is always tangential to the lines of the
direction field. In Figure 2.4 a way of systematically constructing an approximation to
such a curve is shown.

Starting at some point $(t_0, x_0)$, a straight line parallel to the direction field at that
point, $f(t_0, x_0)$, is drawn. This line is followed to a point with abscissa $t_0 + h$. The
ordinate at this point is $x_0 + hf(t_0, x_0)$, which we shall call $X_1$. The value of the direction field
at this new point is calculated, and another straight line from this point with the new
gradient is drawn. This line is followed as far as the point with abscissa $t_0 + 2h$. The
process can be repeated any number of times, and a curve in the (t, x) plane consisting
of a number of short straight-line segments is constructed. The curve is completely
defined by the points at which the line segments join, and these can obviously be
described by the equations

$$ t_1 = t_0 + h, \qquad X_1 = x_0 + hf(t_0, x_0) $$
$$ t_2 = t_1 + h, \qquad X_2 = X_1 + hf(t_1, X_1) $$
$$ t_3 = t_2 + h, \qquad X_3 = X_2 + hf(t_2, X_2) $$
$$ \vdots $$
$$ t_{n+1} = t_n + h, \qquad X_{n+1} = X_n + hf(t_n, X_n) $$

Figure 2.4 The construction of a numerical solution.

Figure 2.5 The Euler-method solutions of $dx/dt = x^2 t e^{-t}$ for h = 0.05, 0.025 and 0.0125. (Curves: the analytic solution and the three Euler approximations.)

These define, mathematically, the simplest method for integrating first-order differential
equations. It is called Euler's method. Solutions are constructed step by step, starting
from some given starting point $(t_0, x_0)$. For a given $t_0$ each different $x_0$ will give rise to
a different solution curve. These curves are all solutions of the differential equation, but
each corresponds to a different initial condition.
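The step-by-step construction translates directly into a short program. The sketch below is a minimal Python illustration, not code from the original text: the function name `euler` and its argument list are assumptions made here. It applies the recurrence $X_{n+1} = X_n + hf(t_n, X_n)$ to the equation dx/dt = x(1 − x)t used for the direction field, starting from the assumed initial point $(t_0, x_0) = (0, 0.5)$ with h = 0.1.

```python
def euler(f, t0, x0, h, n_steps):
    """Advance dx/dt = f(t, x) from (t0, x0) by n_steps Euler steps of size h.

    Returns the lists of abscissae t_n and approximations X_n.
    """
    ts, xs = [t0], [x0]
    for n in range(n_steps):
        xs.append(xs[-1] + h * f(ts[-1], xs[-1]))
        ts.append(t0 + (n + 1) * h)  # computed from t0 to avoid accumulating round-off
    return ts, xs


def f(t, x):
    """Right-hand side of dx/dt = x(1 - x)t."""
    return x * (1.0 - x) * t


ts, xs = euler(f, 0.0, 0.5, 0.1, 20)
for t, x in zip(ts[:4], xs[:4]):
    print(f"t = {t:.1f}   X = {x:.6f}")
```

The first step leaves X unchanged because the gradient is zero at t = 0; the second step then gives $X_2 = 0.5 + 0.1 \times (0.5 \times 0.5 \times 0.1) = 0.5025$, which can be checked by hand against the recurrence.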


The solution curves constructed using this method are obviously not exact solutions
but only approximations to solutions, because they are only tangential to the direction
field at certain points. Between these points, the curves are only approximately
tangential to the direction field. Intuitively, we expect that, as the distance for which we follow
each straight-line segment is reduced, the curve we are constructing will become a
better and better approximation to the exact solution. The increment h in the independent
variable t along each straight-line segment is called the step size used in the solution.


In Figure 2.5 three approximate solutions of the initial-value problem 

$$\frac{dx}{dt} = x^2 t e^{-t}, \quad x(0) = 0.91$$

for step sizes h = 0.05, 0.025 and 0.0125 are shown. These steps are sufficiently small 
that the curves, despite being composed of a series of short straight lines, give the illusion 
of being smooth curves. The equation (2.3) actually has an analytical solution, which 
can be obtained by separation: 

$$x = \frac{1}{(1 + t)e^{-t} + C}$$


The analytical solution to the initial-value problem is also shown in Figure 2.5 for comparison. 
It can be seen that, as we expect intuitively, the smaller the step size the more 
closely the numerical solution approximates the analytical solution. 
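The comparison made in Figure 2.5 can also be checked numerically. The sketch below is illustrative rather than definitive: the helper names `exact` and `euler_final` and the comparison time t = 2 are assumptions made here, not taken from the text. The constant C in the analytical solution is fixed by the initial condition x(0) = 0.91.

```python
import math


def f(t, x):
    # Right-hand side of dx/dt = x^2 t e^{-t}
    return x * x * t * math.exp(-t)


def exact(t, x0=0.91):
    # x = 1 / ((1 + t) e^{-t} + C), with C fixed by x(0) = x0
    c = 1.0 / x0 - 1.0
    return 1.0 / ((1.0 + t) * math.exp(-t) + c)


def euler_final(h, t_end=2.0, x0=0.91):
    """Euler approximation to x(t_end) starting from x(0) = x0."""
    n_steps = round(t_end / h)
    x = x0
    for n in range(n_steps):
        x += h * f(n * h, x)
    return x


errors = {h: abs(exact(2.0) - euler_final(h)) for h in (0.05, 0.025, 0.0125)}
for h, err in errors.items():
    print(f"h = {h:<7} error at t = 2: {err:.5f}")
```

The printed errors should shrink as the step size is reduced, which is the behaviour the figure illustrates.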


MAPLE provides options in the dsolve function, the general-purpose ordinary 
differential equation solver, to return a numerical solution computed using the Euler 
method. Using this option we can easily generate the solutions plotted in Figure 2.5. 
In fact we can readily extend the figure to some smaller time steps. The following 
MAPLE worksheet will produce a figure similar to Figure 2.5 comparing the solutions 
obtained from the Euler method using time steps of 0.05, 0.025, 0.0125, 
0.00625 and 0.003125 with the exact solution. The pattern established in Figure 2.5 can be 
seen to continue, with each halving of the time step producing a solution with a yet 
smaller error when compared with the exact solution. 

> deq1:=diff(x(t),t)=x(t)^2*t*exp(-t);init1:=x(0)=0.91;
> #solve the differential equation with 5 different timesteps
> x1:=dsolve({deq1, init1},
numeric,method=classical[foreuler],output=listprocedure,
stepsize=0.05);
> x2:=dsolve({deq1, init1},
numeric,method=classical[foreuler],output=listprocedure,
stepsize=0.025);
> x3:=dsolve({deq1, init1},
numeric,method=classical[foreuler],output=listprocedure,
stepsize=0.0125);
> x4:=dsolve({deq1, init1},
numeric,method=classical[foreuler],output=listprocedure,
stepsize=0.00625);
> x5:=dsolve({deq1, init1},
numeric,method=classical[foreuler],output=listprocedure,
stepsize=0.003125);
> #extract the five solutions from the listprocedure structures
> for i from 1 to 5 do sol||i:=op(2,x||i[2]) end do;
> #find the exact solution
> xa:=dsolve({deq1, init1});
> #plot the five numerical solutions and the exact solution
> plot([seq(sol||i(t),i=1..5),op(2,xa)(t)],t=0..12);


Example 2.1

The function x(t) satisfies the differential equation 

$$\frac{dx}{dt} = \frac{x + t}{xt}$$

and the initial condition x(1) = 2. Use Euler's method to obtain an approximation to the 
value of x(2) using a step size of h = 0.1. 

Solution

In this example the initial value of t is 1 and x(1) = 2. Using the notation above we have 
$t_0 = 1$ and $x_0 = 2$. The function $f(t, x) = (x + t)/(xt)$. So we have 

$$
\begin{aligned}
t_1 &= t_0 + h = 1 + 0.1 = 1.1000 \\
X_1 &= x_0 + hf(t_0, x_0) = x_0 + h\,\frac{x_0 + t_0}{x_0 t_0} = 2 + 0.1\,\frac{2 + 1}{2 \times 1} = 2.1500 \\
t_2 &= t_1 + h = 1.1000 + 0.1 = 1.2000 \\
X_2 &= X_1 + hf(t_1, X_1) = X_1 + h\,\frac{X_1 + t_1}{X_1 t_1} = 2.1500 + 0.1\,\frac{2.1500 + 1.1000}{2.1500 \times 1.1000} = 2.2874
\end{aligned}
$$

The rest of the solution is obtained step by step as set out in Figure 2.6. The approximation 
X(2) = 3.1162 results. 

Figure 2.6 Computational results for Example 2.1.

t        X        X + t    Xt       h(X + t)/(Xt)
1.0000   2.0000   3.0000   2.0000   0.1500
1.1000   2.1500   3.2500   2.3650   0.1374
1.2000   2.2874   3.4874   2.7449   0.1271
1.3000   2.4145   3.7145   3.1388   0.1183
1.4000   2.5328   3.9328   3.5459   0.1109
1.5000   2.6437   4.1437   3.9656   0.1045
1.6000   2.7482   4.3482   4.3971   0.0989
1.7000   2.8471   4.5471   4.8400   0.0939
1.8000   2.9410   4.7410   5.2939   0.0896
1.9000   3.0306   4.9306   5.7581   0.0856
2.0000   3.1162
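The hand calculation of Example 2.1 can be reproduced with a few lines of code. The following Python sketch is one possible arrangement; the variable names (`rows`, `h`, `t0`, `x0`) are illustrative choices made here. It applies the Euler rule ten times, starting from x(1) = 2, and prints the first two columns of Figure 2.6.

```python
def f(t, x):
    # Right-hand side of Example 2.1: dx/dt = (x + t)/(x t)
    return (x + t) / (x * t)


h, t0, x0 = 0.1, 1.0, 2.0
t, x = t0, x0
rows = [(t, x)]
for n in range(10):
    x = x + h * f(t, x)       # X_{n+1} = X_n + h f(t_n, X_n)
    t = t0 + (n + 1) * h      # t_{n+1}, recomputed from t0
    rows.append((t, x))

for t_n, x_n in rows:
    print(f"{t_n:.4f}  {x_n:.4f}")
```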


The solution to this example could easily be obtained using MAPLE as follows: 

> deq1:=diff(x(t),t)=(x(t)+t)/(x(t)*t);init1:=x(1)=2;
> x1:=dsolve({deq1, init1},
numeric,method=classical[foreuler],output=listprocedure,
stepsize=0.1);
> sol:=op(2,x1[2]);sol(2);


2.3.2 Analysing Euler's method 


We have introduced Euler's method via an intuitive argument from a geometrical 
understanding of the problem. Euler's method can be seen in another light: as an 
application of the Taylor series. The Taylor series expansion for a function $x(t)$ gives 

$$x(t + h) = x(t) + h\frac{dx}{dt}(t) + \frac{h^2}{2!}\frac{d^2x}{dt^2}(t) + \frac{h^3}{3!}\frac{d^3x}{dt^3}(t) + \cdots \tag{2.4}$$

Using this formula, we could, in theory, given the value of $x(t)$ and all the derivatives 
of $x$ at $t$, compute the value of $x(t + h)$ for any given $h$. If we choose a small value for 
$h$ then the Taylor series truncated after a finite number of terms will provide a good 
approximation to the value of $x(t + h)$. Euler's method can be interpreted as using 
the Taylor series truncated after the second term as an approximation to the value of 
$x(t + h)$. 

In order to distinguish between the exact solution of a differential equation and a 
numerical approximation to the exact solution (and it should be appreciated that all 
numerical solutions, however accurate, are only approximations to the exact solution), 
we shall now make explicit the convention that we used in the last section. The 
exact solution of a differential equation will be denoted by a lower-case letter and a 
numerical approximation to the exact solution by the corresponding capital letter. Thus, 
truncating the Taylor series, we write 

$$X(t + h) = x(t) + h\frac{dx}{dt}(t) = x(t) + hf(t, x) \tag{2.5}$$

Applying this truncated Taylor series, starting at the point $(t_0, x_0)$ and denoting $t_0 + nh$ 
by $t_n$, we obtain 

$$
\begin{aligned}
X(t_1) &= X(t_0 + h) = x(t_0) + hf(t_0, x_0) \\
X(t_2) &= X(t_1 + h) = X(t_1) + hf(t_1, X_1) \\
X(t_3) &= X(t_2 + h) = X(t_2) + hf(t_2, X_2)
\end{aligned}
$$

and so on, which is just the Euler-method formula obtained in Section 2.3.1. As an additional 
abbreviated notation, we shall adopt the convention that $x(t_0 + nh)$ is denoted by $x_n$, 
$X(t_0 + nh)$ by $X_n$, $f(t_n, x_n)$ by $f_n$ and $f(t_n, X_n)$ by $F_n$. Hence we may express the Euler 
method, in general terms, as the recursive rule 

$$
\begin{aligned}
X_0 &= x_0 \\
X_{n+1} &= X_n + hF_n \quad (n \ge 0)
\end{aligned}
$$


The advantage of viewing Euler's method as an application of Taylor series in this way 
is that it gives us a clue to obtaining more accurate methods for the numerical solution 
of differential equations. It also enables us to analyse in more detail how accurate 
the Euler method may be expected to be. Using the order notation we can abbreviate 
(2.4) to 


$$x(t + h) = x(t) + hf(t, x) + O(h^2)$$

and, combining this with (2.5), we see that 

$$X(t + h) = x(t + h) + O(h^2) \tag{2.6}$$

(Note that in obtaining this result we have used the fact that signs are irrelevant in 
determining the order of terms; that is, $-O(h^p) = O(h^p)$.) Equation (2.6) expresses the 
fact that at each step of the Euler process the value of $X(t + h)$ obtained has an error of 
order $h^2$, or, to put it another way, the formula used is accurate as far as terms of order 
$h$. For this reason Euler's method is known as a first-order method. The exact size of 
the error is, as we intuitively expected, dependent on the size of $h$, and decreases as $h$ 
decreases. Since the error is of order $h^2$, we expect that halving $h$, for instance, will 
reduce the error at each step by a factor of four. 
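The factor of four can be observed directly by taking a single Euler step from a point on the exact solution. The sketch below is a hedged illustration that reuses the equation of Figure 2.5 and its analytical solution; the function name `one_step_error` and the starting time t = 1 are assumptions made here.

```python
import math


def f(t, x):
    # Right-hand side used for Figure 2.5: dx/dt = x^2 t e^{-t}
    return x * x * t * math.exp(-t)


def exact(t, x0=0.91):
    # Analytical solution with C fixed by x(0) = x0
    c = 1.0 / x0 - 1.0
    return 1.0 / ((1.0 + t) * math.exp(-t) + c)


def one_step_error(t, h):
    """Error of a single Euler step that starts on the exact solution."""
    x = exact(t)
    return abs(exact(t + h) - (x + h * f(t, x)))


t = 1.0
for h in (0.1, 0.05, 0.025):
    ratio = one_step_error(t, h) / one_step_error(t, h / 2)
    print(f"h = {h:<6} step error = {one_step_error(t, h):.3e}  ratio = {ratio:.2f}")
```

Each printed ratio compares the one-step error for h with that for h/2, so values close to four are what an $O(h^2)$ step error predicts.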

This does not, unfortunately, mean that the error in the solution of the initial-value 
problem is reduced by a factor of four. To understand why this is so, we argue as 
follows. Starting from the point $(t_0, x_0)$ and using Euler's method with a step size $h$ to 
obtain a value of $X(t_0 + 4)$, say, requires $4/h$ steps. At each step an error of order $h^2$ is 
incurred. The total error in the value of $X(t_0 + 4)$ will be the sum of the errors incurred 
at each step, and so will be $4/h$ times the value of a typical step error. Hence the total 
error is of the order of $(4/h)O(h^2)$; that is, the total error is $O(h)$. From this argument we 
should expect that if we compare solutions of a differential equation obtained using 
Euler's method with different step sizes, halving the step size will halve the error in the 
solution. Examination of Figure 2.5 confirms that this expectation is roughly correct in 
the case of the solutions presented there. 


Example 2.2

Let $X_a$ denote the approximation to the solution of the initial-value problem 

$$\frac{dx}{dt} = \frac{x^2}{t + 1}, \quad x(0) = 1$$

obtained using Euler's method with a step size h = 0.1, and $X_b$ that obtained using a step size 
of h = 0.05. Compute the values of $X_a(t)$ and $X_b(t)$ for t = 0.1, 0.2, ..., 1.0. Compare 
these values with the values of x(t), the exact solution of the problem. Compute the ratio 
of the errors in $X_a$ and $X_b$. 

Solution

The exact solution, which may be obtained by separation, is 

$$x = \frac{1}{1 - \ln(t + 1)}$$

The numerical solutions $X_a$ and $X_b$ and their errors are shown in Figure 2.7. Of course, 
in this figure the values of $X_a$ are recorded at every step whereas those of $X_b$ are only 
recorded at alternate steps. 

Again, the final column of Figure 2.7 shows that our expectations about the effects 
of halving the step size when using Euler's method to solve a differential equation are 
confirmed. The ratio of the errors is not, of course, exactly one-half, because there are 
some higher-order terms in the errors, which we have ignored. 

Figure 2.7 Computational results for Example 2.2.

t         X_a       X_b       x(t)      |x - X_a|   |x - X_b|   |x - X_b|/|x - X_a|
0.00000   1.00000   1.00000   1.00000
0.10000   1.10000   1.10250   1.10535   0.00535     0.00285     0.53
0.20000   1.21000   1.21603   1.22297   0.01297     0.00695     0.54
0.30000   1.33201   1.34294   1.35568   0.02367     0.01275     0.54
0.40000   1.46849   1.48617   1.50710   0.03861     0.02092     0.54
0.50000   1.62252   1.64952   1.68199   0.05947     0.03247     0.55
0.60000   1.79803   1.83791   1.88681   0.08878     0.04890     0.55
0.70000   2.00008   2.05792   2.13051   0.13042     0.07259     0.56
0.80000   2.23540   2.31857   2.42593   0.19053     0.10736     0.56
0.90000   2.51301   2.63251   2.79216   0.27915     0.15965     0.57
1.00000   2.84539   3.01805   3.25889   0.41350     0.24084     0.58
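The halving behaviour discussed for Example 2.2 can be explored for further step sizes. The Python sketch below is an illustration under stated assumptions: the helper `euler_at` and the extra step sizes 0.025 and 0.0125 are introduced here and are not part of the example. It prints the error in X(1) for each step size together with its ratio to the error for the previous, larger step.

```python
import math


def f(t, x):
    # Example 2.2: dx/dt = x^2/(t + 1), x(0) = 1
    return x * x / (t + 1.0)


def exact(t):
    # x(t) = 1/(1 - ln(t + 1))
    return 1.0 / (1.0 - math.log(t + 1.0))


def euler_at(t_end, h, t0=0.0, x0=1.0):
    """Euler approximation X(t_end) using steps of size h."""
    x = x0
    for n in range(round((t_end - t0) / h)):
        x += h * f(t0 + n * h, x)
    return x


previous = None
for h in (0.1, 0.05, 0.025, 0.0125):
    error = abs(exact(1.0) - euler_at(1.0, h))
    ratio = "" if previous is None else f"{error / previous:.2f}"
    print(f"h = {h:<7} error at t = 1: {error:.5f}  ratio to previous: {ratio}")
    previous = error
```

A ratio that stays near one-half as h is halved is the signature of an $O(h)$ total error.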


2.3.3 Using numerical methods to solve engineering problems 


In Example 2.2 the errors in the values of $X_a$ and $X_b$ are quite large (up to about 14% in 
the worst case). While carrying out computations with large errors such as these is quite 
useful for illustrating the mathematical properties of computational methods, in engineering 
computations we usually need to keep errors very much smaller. Exactly how small they 
must be is largely a matter of engineering judgement. The engineer must decide how 
accurately a result is needed for a given engineering purpose. It is then up to that engineer 
to use the mathematical techniques and knowledge available to carry out the computations 
to the desired accuracy. The engineering decision about the required accuracy will usually 
be based on the use that is to be made of the result. If, for instance, a preliminary design 
study is being carried out then a relatively approximate answer will often suffice, whereas 
for final design work much more accurate answers will normally be required. It must be 
appreciated that demanding greater accuracy than is actually needed for the engineering 
purpose in hand will usually carry a penalty in time, effort or cost. 

Let us imagine that, for the problem posed in Example 2.2, we had decided we needed 
the value of x(1) accurate to 1%. In the cases in which we should normally resort to 
numerical solution we should not have the analytical solution available, so we must 
ignore that solution. We shall suppose then that we had obtained the values of $X_a(1)$ and 
$X_b(1)$ and wanted to predict the step size we should need to use to obtain a better 
approximation to x(1) accurate to 1%. Knowing that the error in $X_b(1)$ should be approximately 
one-half the error in $X_a(1)$ suggests that the error in $X_b(1)$ will be roughly the same as 
the difference between the errors in $X_a(1)$ and $X_b(1)$, which is the same as the difference 
between $X_a(1)$ and $X_b(1)$; that is, 0.172 66. One per cent of $X_b(1)$ is roughly 0.03, that is 
roughly one-sixth of the error in $X_b(1)$. Hence we expect that a step size roughly 
one-sixth of that used to obtain $X_b$ will suffice; that is, a step size h = 0.008 33. In practice, 
of course, we shall round to a more convenient non-recurring decimal quantity such as 
h = 0.008. This procedure is closely related to the Aitken extrapolation procedure 
sometimes used for estimating limits of convergent sequences and series. 


Example 2.3

Compute an approximation X(1) to the value of x(1) satisfying the initial-value problem 

$$\frac{dx}{dt} = \frac{x^2}{t + 1}, \quad x(0) = 1$$

by using Euler's method with a step size h = 0.008. 

Solution

It is worth commenting here that the calculations performed in Example 2.2 could 
reasonably be carried out on any hand-held calculator, but this new calculation requires 
125 steps. To do this is on the boundaries of what might reasonably be done on a 
hand-held calculator, and is more suited to a micro- or minicomputer. Repeating the 
calculation with a step size h = 0.008 produces the result X(1) = 3.213 91. 

We had estimated from the evidence available (that is, values of X(1) obtained using 
step sizes h = 0.1 and 0.05) that the step size h = 0.008 should provide a value of X(1) 
accurate to approximately 1%. Comparison of the value we have just computed with the 
exact solution shows that it is actually in error by approximately 1.4%. This does not quite 
meet the target of 1% that we set ourselves. This example therefore serves, first, to illustrate 
how, given two approximations to x(1) derived using Euler's method with different step 
sizes, we can estimate the step size needed to compute an approximation within a 
desired accuracy, and, secondly, to emphasize that the estimate of the appropriate step 
size is only an estimate, and will not guarantee an approximate solution to the problem 
meeting the desired accuracy criterion. If we had been more conservative and rounded 
the estimated step size down to, say, 0.005, we should have obtained X(1) = 3.230 43, 
which is in error by only 0.9% and would have met the required accuracy criterion. 
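The step-size estimate described in this section can be written as a short calculation. The following Python sketch is a hedged illustration: the names `error_estimate`, `target` and `h_suggested` are introduced here, and the scaling rule simply assumes that the total error is proportional to h. Because it uses the unrounded ratio rather than "roughly one-sixth", it suggests a step size a little above the h = 0.008 33 quoted in the text.

```python
def f(t, x):
    # Example 2.2: dx/dt = x^2/(t + 1), x(0) = 1
    return x * x / (t + 1.0)


def euler_at(t_end, h, t0=0.0, x0=1.0):
    """Euler approximation X(t_end) using steps of size h."""
    x = x0
    for n in range(round((t_end - t0) / h)):
        x += h * f(t0 + n * h, x)
    return x


h_a, h_b = 0.1, 0.05
xa, xb = euler_at(1.0, h_a), euler_at(1.0, h_b)

# Halving h roughly halves the error, so |X_b - X_a| estimates the error in X_b.
error_estimate = abs(xb - xa)
target = 0.01 * abs(xb)          # 1% of the best available value
# Assume the total error is proportional to h and scale h_b accordingly.
h_suggested = h_b * target / error_estimate

print(f"estimated error in X_b(1): {error_estimate:.5f}")
print(f"suggested step size:       {h_suggested:.5f}")
print(f"X(1) with h = 0.008:       {euler_at(1.0, 0.008):.5f}")
```

As the example shows, the result is only an estimate, so rounding the suggested step size down rather than up is the more cautious choice.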


Again the solution to this example could be obtained using MAPLE. The following 
worksheet computes the numerical solution using a step size of 0.008, then the 
analytical solution and finally computes the percentage error in the numerical solution. 

> #set up differential equation
> deq1:=diff(x(t),t)=x(t)^2/(t+1);init1:=x(0)=1;
> #obtain x1, the numerical solution
> x1:=dsolve({deq1, init1},
numeric,method=classical[foreuler],output=listprocedure,
stepsize=0.008);
> #xa is the analytic solution
> xa:=dsolve({deq1, init1});
> xn:=op(2,x1[2])(1);
> #find the percentage error in the numerical solution
> xe:=evalf(subs(t=1,op(2,xa)));
> evalf(100*abs(xe-xn)/xe);


Since we have mentioned in Example 2.3 the use of computers to undertake the 
repetitive calculations involved in the numerical solution of differential equations, it is 
also worth commenting briefly on the writing of computer programs to implement those 
numerical solution methods. Whilst it is perfectly possible to write informal, unstruc- 
tured programs to implement algorithms such as Euler's method, a little attention to 
planning and structuring a program well will usually be amply rewarded — particularly 
in terms of the reduced probability of introducing ‘bugs’. Another reason for careful 
structuring is that, in this way, parts of programs can often be written in fairly general 
terms and can be re-used later for other problems. The two pseudocode algorithms in 
Figures 2.8 and 2.9 will both produce the table of results in Example 2.2. The pseudocode 
program of Figure 2.8 is very specific to the problem posed, whereas that of Figure 2.9 
is more general, better structured, and more expressive of the structure of mathematical 
problems. It is generally better to aim at the style of Figure 2.9. 


Figure 2.8 A poorly structured algorithm for Example 2.2.

x1 ← 1
x2 ← 1
write(vdu, 0, 1, 1, 1)
for i is 1 to 10 do
  x1 ← x1 + 0.1*x1*x1/((i-1)*0.1 + 1)
  x2 ← x2 + 0.05*x2*x2/((i-1)*0.1 + 1)
  x2 ← x2 + 0.05*x2*x2/((i-1)*0.1 + 1.05)
  x ← 1/(1 - ln(i*0.1 + 1))
  write(vdu, 0.1*i, x1, x2, x, x - x1, x - x2, (x - x2)/(x - x1))
endfor


Figure 2.9 A better structured algorithm for Example 2.2.

initial_time ← 0
final_time ← 1
initial_x ← 1
step ← 0.1
t ← initial_time
x1 ← initial_x
x2 ← initial_x
h1 ← step
h2 ← step/2
write(vdu, initial_time, x1, x2, initial_x)
repeat
  euler(t, x1, h1, 1 → x1)
  euler(t, x2, h2, 2 → x2)
  t ← t + step
  x ← exact_solution(t, initial_time, initial_x)
  write(vdu, t, x1, x2, x, abs(x - x1), abs(x - x2), abs((x - x2)/(x - x1)))
until t ≥ final_time


procedure euler(t_old, x_old, step, number → x_new)
  temp_x ← x_old
  for i is 0 to number - 1 do
    temp_x ← temp_x + step*derivative(t_old + step*i, temp_x)
  endfor
  x_new ← temp_x
endprocedure


procedure derivative(t, x → derivative)
  derivative ← x*x/(t + 1)
endprocedure


procedure exact_solution(t, t0, x0 → exact_solution)
  c ← ln(t0 + 1) + 1/x0
  exact_solution ← 1/(c - ln(t + 1))
endprocedure
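The structured algorithm of Figure 2.9 carries over almost line for line into a modern language. The Python sketch below is one possible rendering, not a definitive translation: the wrapper function `tabulate` is introduced here, and the repeat-until loop has been replaced by a counted loop so that termination does not depend on a floating-point comparison of t with the final time.

```python
import math


def derivative(t, x):
    # Right-hand side of Example 2.2: dx/dt = x^2/(t + 1)
    return x * x / (t + 1.0)


def exact_solution(t, t0, x0):
    c = math.log(t0 + 1.0) + 1.0 / x0
    return 1.0 / (c - math.log(t + 1.0))


def euler(t_old, x_old, step, number):
    """Take `number` Euler steps of size `step` starting from (t_old, x_old)."""
    temp_x = x_old
    for i in range(number):
        temp_x = temp_x + step * derivative(t_old + step * i, temp_x)
    return temp_x


def tabulate(initial_time=0.0, final_time=1.0, initial_x=1.0, step=0.1):
    """Return rows (t, X_a, X_b, x, |x - X_a|, |x - X_b|, error ratio)."""
    n_rows = round((final_time - initial_time) / step)
    x1 = x2 = initial_x
    rows = []
    for k in range(n_rows):
        t = initial_time + k * step
        x1 = euler(t, x1, step, 1)        # one step of size h
        x2 = euler(t, x2, step / 2, 2)    # two steps of size h/2
        t_new = initial_time + (k + 1) * step
        x = exact_solution(t_new, initial_time, initial_x)
        rows.append((t_new, x1, x2, x, abs(x - x1), abs(x - x2),
                     abs((x - x2) / (x - x1))))
    return rows


for row in tabulate():
    print("  ".join(f"{value:.5f}" for value in row))
```

Only `derivative` and `exact_solution` are specific to the problem, so the same stepping code can be reused for other differential equations.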


2.3.4 Exercises 


All the exercises in this section can be completed using MAPLE in a similar manner to Examples 2.1 and 2.3 above. 
In particular MAPLE or some other form of computer assistance should be used for Exercises 5, 6 and 7. If you do 
not have access to MAPLE, you will need to write a program in MATLAB or some other high-level scientific 
computer programming language (e.g. Pascal or C). 


1  Find the value of X(0.3) for the initial-value problem 

$$\frac{dx}{dt} = -\tfrac{1}{2}xt, \quad x(0) = 1$$

using Euler's method with step size h = 0.1. 

2  Find the value of X(1.1) for the initial-value problem 

$$\frac{dx}{dt} = -\tfrac{1}{2}xt, \quad x(1) = 0.1$$

using Euler's method with step size h = 0.025. 

3  Find the value of X(1) for the initial-value problem 

$$\frac{dx}{dt} = \frac{x}{2(t + 1)}, \quad x(0.5) = 1$$

using Euler's method with step size h = 0.1. 

4  Find the value of X(0.5) for the initial-value problem 

$$\frac{dx}{dt} = \frac{4 - t}{t + x}, \quad x(0) = 1$$

using Euler's method with step size h = 0.05. 

5  Denote the Euler-method solution of the initial-value problem 

$$\frac{dx}{dt} = \frac{xt}{t^2 + 2}, \quad x(1) = 2$$

using step size h = 0.1 by $X_a(t)$, and that using h = 0.05 by $X_b(t)$. Find the values of 
$X_a(2)$ and $X_b(2)$. Estimate the error in the value of $X_b(2)$, and suggest a value of step 
size that would provide a value of X(2) accurate to 0.1%. Find the value of X(2) using 
this step size. Find the exact solution of the initial-value problem, and determine the 
actual magnitude of the errors in $X_a(2)$, $X_b(2)$ and your final value of X(2). 

6  Denote the Euler-method solution of the initial-value problem 

$$\frac{dx}{dt} = \frac{1}{xt}, \quad x(1) = 1$$

using step size h = 0.1 by $X_a(t)$, and that using h = 0.05 by $X_b(t)$. Find the values of 
$X_a(2)$ and $X_b(2)$. Estimate the error in the value of $X_b(2)$, and suggest a value of step 
size that would provide a value of X(2) accurate to 0.2%. Find the value of X(2) using 
this step size. Find the exact solution of the initial-value problem, and determine the 
actual magnitude of the errors in $X_a(2)$, $X_b(2)$ and your final value of X(2). 

7  Denote the Euler-method solution of the initial-value problem 

$$\frac{dx}{dt} = \frac{1}{\ln x}, \quad x(1) = 1.2$$

using step size h = 0.05 by $X_a(t)$, and that using h = 0.025 by $X_b(t)$. Find the values of 
$X_a(1.5)$ and $X_b(1.5)$. Estimate the error in the value of $X_b(1.5)$, and suggest a value of 
step size that would provide a value of X(1.5) accurate to 0.25%. Find the value of 
X(1.5) using this step size. Find the exact solution of the initial-value problem, and 
determine the actual magnitude of the errors in $X_a(1.5)$, $X_b(1.5)$ and your final value 
of X(1.5). 


2.3.5 More accurate solution methods: multistep methods 


In Section 2.3.2 we discovered that using Euler's method to solve a differential equation 
is essentially equivalent to using a Taylor series expansion of a function truncated 
after two terms. Since, by so doing, we are ignoring terms $O(h^2)$, an error of this order 
is introduced at each step in the solution. Could we not derive a method for calculating 
approximate solutions of differential equations which, by using more terms of the 
Taylor series, provides greater accuracy than Euler's method? We can, but there are 
some disadvantages in so doing, and various methods have to be used to overcome 
these. 

Let us first consider a Taylor series expansion with the first three terms written 
explicitly. This gives 

$$x(t + h) = x(t) + h\frac{dx}{dt}(t) + \frac{h^2}{2!}\frac{d^2x}{dt^2}(t) + O(h^3) \tag{2.7}$$

Substituting f(t, x) for dx/dt, we obtain 

$$x(t + h) = x(t) + hf(t, x) + \frac{h^2}{2!}\frac{df}{dt}(t, x) + O(h^3)$$

Dropping the $O(h^3)$ terms provides an approximation 

$$X(t + h) = x(t) + hf(t, x) + \frac{h^2}{2!}\frac{df}{dt}(t, x)$$

such that 

$$X(t + h) = x(t + h) + O(h^3)$$

in other words, a numerical approximation method which has an error at each step that 
is not of order $h^2$ like the Euler method but rather of order $h^3$. The corresponding general 
numerical scheme is 

$$X_{n+1} = X_n + hF_n + \frac{h^2}{2}\frac{dF_n}{dt} \tag{2.8}$$





The application of the formula (2.5) in Euler's method was straightforward because 
an expression for f(t, x) was provided by the differential equation itself. To apply (2.8) 
as it stands requires an analytical expression for df/dt so that dF,/dt may be computed. 
This may be relatively straightforward to provide — or it may be quite complicated. 
Although, using modern computer algebra systems, it is now often possible to compute 
analytical expressions for the derivatives of many functions, the need to do so remains 
a considerable disadvantage when compared with methods which do not require the 
function's derivatives to be provided. 
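To make the extra requirement concrete, the sketch below applies the three-term Taylor scheme (2.8) to the problem of Example 2.2, for which the derivative can be worked out by hand. It is an illustration under stated assumptions: the expression in `dfdt` is the total derivative $df/dt = \partial f/\partial t + f\,\partial f/\partial x$ computed here for $f = x^2/(t + 1)$, and the function names are choices made for this example.

```python
import math


def f(t, x):
    # Example 2.2: dx/dt = x^2/(t + 1)
    return x * x / (t + 1.0)


def dfdt(t, x):
    # Total derivative df/dt = (partial f/partial t) + (partial f/partial x) * f,
    # worked out by hand for this particular f.
    return -x * x / (t + 1.0) ** 2 + (2.0 * x / (t + 1.0)) * f(t, x)


def exact(t):
    return 1.0 / (1.0 - math.log(t + 1.0))


def integrate(h, n_steps, second_order):
    """Integrate from x(0) = 1 with Euler or with the three-term Taylor scheme."""
    x = 1.0
    for n in range(n_steps):
        t = n * h
        step = h * f(t, x)
        if second_order:
            step += 0.5 * h * h * dfdt(t, x)   # extra term of scheme (2.8)
        x += step
    return x


err_euler = abs(exact(1.0) - integrate(0.1, 10, second_order=False))
err_taylor = abs(exact(1.0) - integrate(0.1, 10, second_order=True))
print(f"Euler error at t = 1:        {err_euler:.5f}")
print(f"scheme (2.8) error at t = 1: {err_taylor:.5f}")
```

The gain in accuracy comes at the price of the hand-derived `dfdt`, which would have to be reworked for every new differential equation.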

Fortunately, there are ways to work around this difficulty. One such method hinges 
on the observation that it is just as valid to write down Taylor series expansions for 
negative increments as for positive ones. The Taylor series expansion of $x(t - h)$ is 

$$x(t - h) = x(t) - h\frac{dx}{dt}(t) + \frac{h^2}{2!}\frac{d^2x}{dt^2}(t) - \frac{h^3}{3!}\frac{d^3x}{dt^3}(t) + \cdots$$

If we write only the first three terms explicitly, we have 

$$x(t - h) = x(t) - h\frac{dx}{dt}(t) + \frac{h^2}{2!}\frac{d^2x}{dt^2}(t) + O(h^3)$$

or, rearranging the equation, 

$$\frac{h^2}{2!}\frac{d^2x}{dt^2}(t) = x(t - h) - x(t) + h\frac{dx}{dt}(t) + O(h^3)$$

Substituting this into (2.7), we obtain 

$$x(t + h) = x(t) + h\frac{dx}{dt}(t) + \left[x(t - h) - x(t) + h\frac{dx}{dt}(t) + O(h^3)\right] + O(h^3)$$

That is, 

$$x(t + h) = x(t - h) + 2h\frac{dx}{dt}(t) + O(h^3)$$

or, substituting f(t, x) for dx/dt, 

$$x(t + h) = x(t - h) + 2hf(t, x) + O(h^3) \tag{2.9}$$


Alternatively, we could write down the Taylor series expansion of the function dx/dt 
with an increment of $-h$: 

$$\frac{dx}{dt}(t - h) = \frac{dx}{dt}(t) - h\frac{d^2x}{dt^2}(t) + \frac{h^2}{2!}\frac{d^3x}{dt^3}(t) - O(h^3)$$

Writing only the first two terms explicitly and rearranging gives 

$$h\frac{d^2x}{dt^2}(t) = \frac{dx}{dt}(t) - \frac{dx}{dt}(t - h) + O(h^2)$$

and substituting this into (2.4) gives 

$$x(t + h) = x(t) + h\frac{dx}{dt}(t) + \frac{h}{2}\left[\frac{dx}{dt}(t) - \frac{dx}{dt}(t - h) + O(h^2)\right] + O(h^3)$$

That is, 

$$x(t + h) = x(t) + \frac{h}{2}\left[3\frac{dx}{dt}(t) - \frac{dx}{dt}(t - h)\right] + O(h^3)$$

or, substituting f(t, x) for dx/dt, 

$$x(t + h) = x(t) + \tfrac{1}{2}h\left[3f(t, x(t)) - f(t - h, x(t - h))\right] + O(h^3) \tag{2.10}$$

Equations (2.7), (2.9) and (2.10) each give an expression for $x(t + h)$ in which all 
terms up to those in $h^2$ have been made explicit. In the same way as, by ignoring terms 
of $O(h^3)$ in (2.7), the numerical scheme (2.8) can be obtained, (2.9) and (2.10) give rise 
to the numerical schemes 

$$X_{n+1} = X_{n-1} + 2hF_n \tag{2.11}$$

and 

$$X_{n+1} = X_n + \tfrac{1}{2}h(3F_n - F_{n-1}) \tag{2.12}$$

respectively. Each of these alternative schemes, like (2.8), incurs an error $O(h^3)$ at 
each step. 

The advantage of (2.11) or (2.12) over (2.8) arises because the derivative of 
f(t, x) in (2.7) has been replaced in (2.9) by the value of the function x at the 
previous time, $x(t - h)$, and in (2.10) by the value of the function f at time $t - h$. This 
is reflected in (2.11) and (2.12) by the presence of the terms in $X_{n-1}$ and $F_{n-1}$ 
respectively and the absence of the term in $dF_n/dt$. The elimination of the derivative of the 
function f(t, x) from the numerical scheme is an advantage, but it is not without its 
penalties. In both (2.11) and (2.12) the value of $X_{n+1}$ depends not only on the values 
of $X_n$ and $F_n$ but also on the value of one or the other at $t_{n-1}$. This is chiefly a problem 
when starting the computation. In the case of the Euler scheme the first step took 
the form 

$$X_1 = X_0 + hF_0$$

In the case of (2.11) and (2.12) the first step would seem to take the forms 

$$X_1 = X_{-1} + 2hF_0$$

and 

$$X_1 = X_0 + \tfrac{1}{2}h(3F_0 - F_{-1})$$

respectively. The value of $X_{-1}$ in the first case and $F_{-1}$ in the second is not normally 
available. The resolution of this difficulty is usually to use some other method to 
start the computation, and, when the value of $X_1$, and therefore also the value of $F_1$, 
is available, change to (2.11) or (2.12). The first step using (2.11) or (2.12) therefore 
involves 

$$X_2 = X_0 + 2hF_1$$

or 

$$X_2 = X_1 + \tfrac{1}{2}h(3F_1 - F_0)$$

Methods like (2.11) and (2.12) that involve the values of the dependent variable or its 
derivative at more than one value of the independent variable are called multistep 
methods. These all share the problem that we have just noted of difficulties in deciding 
how to start the computation. We shall return to this problem of starting multistep methods 
in Section 2.3.7. 


Example 2.4

Solve the initial-value problem 

$$\frac{dx}{dt} = \frac{x^2}{t + 1}, \quad x(0) = 1$$

posed in Example 2.2 using the scheme (2.12) with a step size h = 0.1. Compute the 
values of X(t) for t = 0.1, 0.2, ..., 1.0 and compare them with the values of the exact 
solution x(t). 

Solution

We shall assume that the value of X(0.1) has been computed using some other method 
and has been found to be 1.105 35. The computation therefore starts with the calculation 
of the values of $F_1$, $F_0$ and hence $X_2$. Using the standard notation we have $t_0 = 0$ and 
$x_0 = 1$. The function $f(t, x) = x^2/(t + 1)$. Using the given value X(0.1) = 1.105 35, we have 
$t_1 = 0.1$ and $X_1 = 1.105\,35$. So the first step is 

$$
\begin{aligned}
t_2 &= t_1 + h = 0.100\,00 + 0.1 = 0.200\,00 \\
X_2 &= X_1 + \tfrac{1}{2}h(3F_1 - F_0) = X_1 + \tfrac{1}{2}h\left[3f(t_1, X_1) - f(t_0, x_0)\right] \\
    &= X_1 + \tfrac{1}{2}h\left(3\,\frac{X_1^2}{t_1 + 1} - \frac{x_0^2}{t_0 + 1}\right) \\
    &= 1.105\,35 + \tfrac{1}{2}(0.1)\left(3\,\frac{1.105\,35^2}{0.1 + 1} - \frac{1^2}{0 + 1}\right) = 1.221\,96
\end{aligned}
$$

The results of the computation are shown in Figure 2.10. 

Figure 2.10 Computational results for Example 2.4.

t         X_n       F_n       (1/2)h(3F_n - F_{n-1})   x(t)      |x - X_n|
0.00000   1.00000   1.00000
0.10000   1.10535   1.11073   0.11661                  1.10535   0.00000
0.20000   1.22196   1.24432   0.13111                  1.22297   0.00101
0.30000   1.35307   1.40831   0.14903                  1.35568   0.00261
0.40000   1.50210   1.61165   0.17133                  1.50710   0.00499
0.50000   1.67344   1.86692   0.19946                  1.68199   0.00855
0.60000   1.87289   2.19233   0.23550                  1.88681   0.01391
0.70000   2.10839   2.61490   0.28262                  2.13051   0.02211
0.80000   2.39101   3.17608   0.34567                  2.42593   0.03492
0.90000   2.73668   3.94180   0.43247                  2.79216   0.05548
1.00000   3.16914                                      3.25889   0.08975
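The computation of Example 2.4 can be sketched in a few lines. The Python below is an illustration only; the list names `xs` and `fs` are choices made here. It takes the starting value X(0.1) = 1.105 35 as given, exactly as the example does, and then applies scheme (2.12) repeatedly, storing each value of F so that it can be reused at the next step.

```python
import math


def f(t, x):
    # dx/dt = x^2/(t + 1), the problem of Examples 2.2 and 2.4
    return x * x / (t + 1.0)


def exact(t):
    return 1.0 / (1.0 - math.log(t + 1.0))


h = 0.1
xs = [1.0, 1.10535]                 # X_0 and the supplied starting value X_1
fs = [f(0.0, xs[0]), f(h, xs[1])]   # F_0 and F_1

for n in range(1, 10):
    # Scheme (2.12): X_{n+1} = X_n + (h/2)(3 F_n - F_{n-1})
    x_next = xs[n] + 0.5 * h * (3.0 * fs[n] - fs[n - 1])
    xs.append(x_next)
    fs.append(f((n + 1) * h, x_next))

for n, x_n in enumerate(xs):
    print(f"{n * h:.1f}  {x_n:.5f}  {abs(exact(n * h) - x_n):.5f}")
```

Only one new evaluation of f is needed per step, because the value of F from the previous step is kept in `fs`.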


It is instructive to compare the values of X computed in Example 2.4 with those 
computed in Example 2.2. Since the method we are using here is a second-order method, 
the error at each step should be $O(h^3)$ rather than the $O(h^2)$ error of the Euler method. 
We are using the same step size as for the solution $X_a$ of Example 2.2, so the errors 
should be correspondingly smaller. Because in this case we know the exact solution of 
the differential equation, we can compute the errors. Examination of the results shows 
that they are indeed much smaller than those of the Euler method, and also considerably 
smaller than the errors in the Euler method solution $X_b$ which used step size h = 0.05, 
half the step size used here. 

In fact, some numerical experimentation (which we shall not describe in detail) 
reveals that to achieve a similarly low level of errors, the Euler method requires a step 
size h = 0.016, and therefore 63 steps are required to find the value of X(1). The 
second-order method of (2.12) requires only 10 steps to find X(1) to a similar accuracy. Thus 
the solution of a problem to a given accuracy using a second-order method can be 
achieved in a much shorter computer processing time than using a first-order method. 
When very large calculations are involved or simple calculations are repeated very 
many times, such savings are very important. 

How do we choose between methods of equal accuracy such as (2.11) and (2.12)? 
Numerical methods for the solution of differential equations have other properties 
apart from accuracy. One important property is stability. Some methods have the 
ability to introduce gross errors into the numerical approximation to the exact solution 
of a problem. The sources of these gross errors are the so-called parasitic 
solutions of the numerical process, which do not correspond to solutions of the 
differential equation. The analysis of this behaviour is beyond the scope of this 
book, but methods that are susceptible to it are intrinsically less useful than those 
that are not. The method of (2.11) can show unstable behaviour, as demonstrated in 
Example 2.5. 


Example 2.5

Let $X_a$ denote the approximation to the solution of the initial-value problem 

$$\frac{dx}{dt} = -3x + 2e^{-t}, \quad x(0) = 2$$

obtained using the method defined by (2.11), and $X_b$ that obtained using the method 
defined by (2.12), both with step size h = 0.1. Compute the values of $X_a(t)$ and $X_b(t)$ for 
t = 0.1, 0.2, ..., 2.0. Compare these with the values of x(t), the exact solution of the 
problem. In order to overcome the difficulty of starting the processes, assume that the 
value X(0.1) = 1.645 66 has been obtained by another method. 

Solution

The exact solution of the problem, which is a linear equation and so may be solved by 
the integrating-factor method, is 

$$x = e^{-t} + e^{-3t}$$

The numerical solutions $X_a$ and $X_b$ and their errors are shown in Figure 2.11. It can be 
seen that $X_a$ exhibits an unexpected oscillatory behaviour, leading to large errors in the 
solution. This is typical of the type of instability from which the scheme (2.11) and 
those like it are known to suffer. The scheme defined by (2.11) is not unstable for all 
differential equations, but only for a certain class. The possibility of instability in 
numerical schemes is one that should always be borne in mind, and the intelligent user 
is always critical of the results of numerical work and alert for signs of this type of 
problem. 

Figure 2.11 Computational results for Example 2.5.

t          X_a        X_b       x(t)      x - X_a    x - X_b
0.00000    2.00000    2.00000   2.00000
0.10000    1.64566    1.64566   1.64566    0.00000    0.00000
0.20000    1.37454    1.37656   1.36754   -0.00700   -0.00902
0.30000    1.14842    1.15909   1.14739   -0.00104   -0.01170
0.40000    0.98182    0.98436   0.97151   -0.01030   -0.01284
0.50000    0.82746    0.84227   0.82966    0.00220   -0.01261
0.60000    0.72795    0.72583   0.71411   -0.01384   -0.01172
0.70000    0.61022    0.62954   0.61904    0.00883   -0.01050
0.80000    0.56045    0.54922   0.54005   -0.02041   -0.00917
0.90000    0.45368    0.48164   0.47378    0.02010   -0.00786
1.00000    0.45088    0.42432   0.41767   -0.03321   -0.00666
1.10000    0.33030    0.37533   0.36975    0.03945   -0.00558
1.20000    0.38584    0.33315   0.32852   -0.05733   -0.00464
1.30000    0.21927    0.29660   0.29277    0.07350   -0.00383
1.40000    0.36329    0.26475   0.26159   -0.10170   -0.00315
1.50000    0.09993    0.23683   0.23424    0.13431   -0.00259
1.60000    0.39259    0.21225   0.21013   -0.18246   -0.00212
1.70000   -0.05486    0.19052   0.18878    0.24364   -0.00173
1.80000    0.49857    0.17124   0.16982   -0.32876   -0.00142
1.90000   -0.28788    0.15408   0.15291    0.44080   -0.00116
2.00000    0.73113    0.13877   0.13781   -0.59332   -0.00096
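The contrast between the two schemes in Example 2.5 is easy to reproduce. The following Python sketch is a hedged illustration (the list names `xa` and `xb` are choices made here): both schemes start from the same pair of values, x(0) = 2 and the supplied X(0.1) = 1.645 66, and are advanced side by side to t = 2.

```python
import math


def f(t, x):
    # Example 2.5: dx/dt = -3x + 2 e^{-t}
    return -3.0 * x + 2.0 * math.exp(-t)


def exact(t):
    return math.exp(-t) + math.exp(-3.0 * t)


h, n_steps = 0.1, 20
xa = [2.0, 1.64566]     # scheme (2.11), started from the supplied X(0.1)
xb = [2.0, 1.64566]     # scheme (2.12), same starting values

for n in range(1, n_steps):
    t_n, t_prev = n * h, (n - 1) * h
    # (2.11): X_{n+1} = X_{n-1} + 2h F_n
    xa.append(xa[n - 1] + 2.0 * h * f(t_n, xa[n]))
    # (2.12): X_{n+1} = X_n + (h/2)(3 F_n - F_{n-1})
    xb.append(xb[n] + 0.5 * h * (3.0 * f(t_n, xb[n]) - f(t_prev, xb[n - 1])))

for n in range(n_steps + 1):
    x = exact(n * h)
    print(f"{n * h:.1f}  {xa[n]:9.5f}  {xb[n]:9.5f}  {x - xa[n]:9.5f}  {x - xb[n]:9.5f}")
```

The error column for (2.11) alternates in sign and grows from step to step, while that for (2.12) decays, which is the oscillatory instability described in the solution.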


In this section we have seen how, starting from the Taylor series for a function, 
schemes of a higher order of accuracy than Euler's method can be constructed. We have 
constructed two second-order schemes. The principle of this technique can be extended 
to produce schemes of yet higher orders. They will obviously introduce more values 
of $X_m$ or $F_m$ (where $m = n - 2, n - 3, \ldots$). The scheme (2.12) is, in fact, a member of a 
family of schemes known as the Adams-Bashforth formulae. The first few members 
of this family are 

$$
\begin{aligned}
X_{n+1} &= X_n + hF_n \\
X_{n+1} &= X_n + \tfrac{1}{2}h(3F_n - F_{n-1}) \\
X_{n+1} &= X_n + \tfrac{1}{12}h(23F_n - 16F_{n-1} + 5F_{n-2}) \\
X_{n+1} &= X_n + \tfrac{1}{24}h(55F_n - 59F_{n-1} + 37F_{n-2} - 9F_{n-3})
\end{aligned}
$$

The formulae represent first-, second-, third- and fourth-order methods respectively. The
first-order Adams–Bashforth formula is just the Euler method, the second-order
one is the scheme we introduced as (2.12), while the third- and fourth-order formulae
are extensions of the principle we have just introduced. Obviously all of these require
special methods to start the process in the absence of values of $X_{-1}$, $F_{-1}$, $X_{-2}$, $F_{-2}$ and
so on.

Some of the methods used by the standard MATLAB procedures for numerical
solution of ODEs are based on more sophisticated versions of the multistep methods
which we have just introduced. Multistep methods are particularly suitable for solving
equations in which the derivative function, $f(t, x)$, is relatively computationally costly
to evaluate. At each step a multistep method can reuse the values of the function
already computed at previous steps, so the number of evaluations of the derivative
function is reduced compared with some other methods.


2.3.6 Local and global truncation errors


In Section 2.3.2 we argued intuitively that, although the Euler method introduces an
error $O(h^2)$ at each step, it yields an $O(h)$ error in the value of the dependent variable
corresponding to a given value of the independent variable. What is the equivalent
result for the second-order methods we have introduced in Section 2.3.5? We shall
answer this question with a slightly more general analysis that will also be useful to us
in succeeding sections.

First let us define two types of error. The local error in a method for integrating
a differential equation is the error introduced at each step. Thus if the method is
defined by

$$X_{n+1} = g(h, t_n, X_n, t_{n-1}, X_{n-1}, \ldots)$$

and analysis shows us that

$$x_{n+1} = g(h, t_n, x_n, t_{n-1}, x_{n-1}, \ldots) + O(h^{p+1})$$

then we say that the local error in the method is of order $p + 1$ or that the method is a
$p$th-order method.

The global error of an integration method is the error in the value of $X(t_0 + a)$
obtained by using that method to advance the required number of steps from a known
value of $x(t_0)$. Using a $p$th-order method, the first step introduces an error $O(h^{p+1})$. The
next step takes the approximation $X_1$ and derives an estimate $X_2$ of $x_2$ that introduces a
further error $O(h^{p+1})$. The number of steps needed to calculate the value $X(t_0 + a)$ is $a/h$.
Hence we have

$$X(t_0 + a) = x(t_0 + a) + \frac{a}{h}\,O(h^{p+1})$$

Dividing a quantity that is $O(h^r)$ by $h$ produces a quantity that is $O(h^{r-1})$, so we must
have

$$X(t_0 + a) = x(t_0 + a) + O(h^p)$$

In other words, the global error produced by a method that has a local error $O(h^{p+1})$
is $O(h^p)$. As we saw in Example 2.2, halving the step size for a calculation using
Euler’s method produces errors that are roughly half as big. This is consistent with
the global error being $O(h)$. Since the local error of the Euler method is $O(h^2)$, this is
as we should expect. Let us now repeat Example 2.2 using the second-order
Adams–Bashforth method, (2.12).


Example 2.6  Let $X_a$ denote the approximation to the solution of the initial-value problem

$$\frac{dx}{dt} = \frac{x^2}{t+1}, \quad x(0) = 1$$

obtained using the second-order Adams–Bashforth method with a step size $h = 0.1$, and
$X_b$ that obtained using a step size of $h = 0.05$. Compute the values of $X_a(t)$ and $X_b(t)$ for
$t = 0.1, 0.2, \ldots, 1.0$. Compare these values with the values of $x(t)$, the exact solution
of the problem. Compute the ratio of the errors in $X_a$ and $X_b$. In order to start the process,
assume that the values $X(-0.1) = 0.90468$ and $X(-0.05) = 0.95121$ have already been
obtained by another method.

Solution  The exact solution was given in Example 2.2. The numerical solutions $X_a$ and $X_b$ and
their errors are shown in Figure 2.12.

Because the method is second-order, we expect the global error to vary like $h^2$.
Theoretically, then, the error in the solution $X_b$ should be one-quarter that in $X_a$. We see
that this expectation is approximately borne out in practice.

Figure 2.12 Computational results for Example 2.6.

t         X_a        X_b        x(t)       |x - X_a|   |x - X_b|   |x - X_b|/|x - X_a|
0.00000   1.00000    1.00000    1.00000
0.10000   1.10453    1.10512    1.10535    0.00082     0.00023     0.28
0.20000   1.22089    1.22239    1.22297    0.00208     0.00058     0.28
0.30000   1.35176    1.35459    1.35568    0.00392     0.00109     0.28
0.40000   1.50049    1.50525    1.50710    0.00661     0.00185     0.28
0.50000   1.67144    1.67903    1.68199    0.01055     0.00296     0.28
0.60000   1.87040    1.88217    1.88681    0.01640     0.00464     0.28
0.70000   2.10525    2.12331    2.13051    0.02525     0.00720     0.29
0.80000   2.38700    2.41470    2.42593    0.03893     0.01123     0.29
0.90000   2.73145    2.77440    2.79216    0.06070     0.01776     0.29
1.00000   3.16220    3.23007    3.25889    0.09670     0.02882     0.30
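
The calculation behind Figure 2.12 can be sketched in a few lines of Python. This is an illustrative sketch, not the program used to produce the figure: the names `f`, `exact` and `adams_bashforth2` are our own, and the closed form used in `exact`, $x(t) = 1/(1 - \ln(t+1))$, is an assumption that can be checked by substitution into the differential equation and against the $x(t)$ column. The starting values are those quoted in the example.

```python
import math

def f(t, x):
    """Right-hand side of dx/dt = x^2/(t + 1)."""
    return x * x / (t + 1.0)

def exact(t):
    """Assumed closed form x(t) = 1/(1 - ln(t + 1)) for x(0) = 1."""
    return 1.0 / (1.0 - math.log(t + 1.0))

def adams_bashforth2(f, t0, x0, x_prev, h, n_steps):
    """Second-order Adams-Bashforth scheme (2.12).

    x_prev is the value of X at t0 - h, needed to start the two-step scheme.
    Returns the list [X_0, X_1, ..., X_n_steps].
    """
    xs = [x0]
    f_prev = f(t0 - h, x_prev)          # F_{-1}
    x = x0
    for n in range(n_steps):
        t = t0 + n * h
        f_curr = f(t, x)                # F_n: the only new evaluation in this step
        x = x + 0.5 * h * (3.0 * f_curr - f_prev)
        xs.append(x)
        f_prev = f_curr                 # F_n becomes F_{n-1} at the next step
    return xs

xa = adams_bashforth2(f, 0.0, 1.0, 0.90468, 0.1, 10)    # h = 0.1
xb = adams_bashforth2(f, 0.0, 1.0, 0.95121, 0.05, 20)   # h = 0.05

for i in range(1, 11):
    t = 0.1 * i
    err_a = abs(exact(t) - xa[i])
    err_b = abs(exact(t) - xb[2 * i])
    print(f"{t:4.1f} {xa[i]:9.5f} {xb[2 * i]:9.5f} {exact(t):9.5f} {err_b / err_a:5.2f}")
```

Note that each step costs a single new evaluation of $f(t, x)$, because $F_n$ is carried forward and reused as $F_{n-1}$.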


Just as previously we outlined how, for the Euler method, we could estimate from
two solutions of the differential equation the step size that would suffice to compute a
solution to any required accuracy, so we can do the same in a more general way. If we
use a $p$th-order method to compute two estimates $X_a(t_0 + a)$ and $X_b(t_0 + a)$ of $x(t_0 + a)$
using step sizes $h$ and $\tfrac{1}{2}h$ then, because the global error of the process is $O(h^p)$, we
expect the error in $X_a(t_0 + a)$ to be roughly $2^p$ times that in $X_b(t_0 + a)$. Hence the error in
$X_b(t_0 + a)$ may be estimated to be

$$\frac{|X_a(t_0 + a) - X_b(t_0 + a)|}{2^p - 1}$$

If the desired error, which may be expressed in absolute terms or may be derived from
a desired maximum percentage error, is $\varepsilon$ then the factor $k$, say, by which the error in
$X_b(t_0 + a)$ must be reduced is

$$k = \frac{|X_a(t_0 + a) - X_b(t_0 + a)|}{\varepsilon(2^p - 1)}$$


Since reducing the step size by a factor of $q$ will, for a $p$th-order error, reduce the error
by a factor of $q^p$, the factor by which step size must be reduced in order to meet the
error criterion is the $p$th root of $k$. The step size used to compute $X_b$ is $\tfrac{1}{2}h$, so finally we
estimate the required step size as

$$h_\varepsilon = \frac{\tfrac{1}{2}h}{k^{1/p}} = \tfrac{1}{2}h\left(\frac{\varepsilon(2^p - 1)}{|X_a(t_0 + a) - X_b(t_0 + a)|}\right)^{1/p} \tag{2.13}$$

This technique of estimating the error in a numerical approximation of an unknown
quantity by comparing two approximations of that unknown quantity whose order of
accuracy is known is an example of the application of Richardson extrapolation.

Example 2.7  Estimate the step size required to compute an estimate of $x(1)$ accurate to 2dp for the
initial-value problem in Example 2.6 given the values $X_a(1) = 3.16220$ and $X_b(1) =
3.23007$ obtained using step sizes $h = 0.1$ and $0.05$ respectively.

Solution  For the result to be accurate to 2dp the error must be less than 0.005. The estimates
$X_a(1)$ and $X_b(1)$ were obtained using a second-order process, so, applying (2.13), with
$\varepsilon = 0.005$, $\tfrac{1}{2}h = 0.05$ and $p = 2$, we have

$$h_\varepsilon = 0.05\left(\frac{0.015}{|3.16220 - 3.23007|}\right)^{1/2} = 0.0235$$

In a real engineering problem what we would usually do is round this down to, say,
0.02 and recompute $X(1)$ using step sizes $h = 0.04$ and $0.02$. These two new estimates
of $X(1)$ could then be used to estimate again the error in the value of $X(1)$ and confirm
that the desired error criterion had been met.
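
The estimate (2.13) is easily packaged as a small function. The sketch below is illustrative only; the name `required_step` and its argument names are our own choices rather than anything defined in the text.

```python
def required_step(x_a, x_b, h, p, eps):
    """Estimate the step size needed to reach error eps, following (2.13).

    x_a, x_b : estimates of the same quantity obtained with step sizes h and h/2
    p        : order of the method used
    eps      : desired (absolute) error
    """
    # k is the factor by which the estimated error in x_b must be reduced
    k = abs(x_a - x_b) / (eps * (2 ** p - 1))
    # x_b was computed with step h/2; a pth-order error scales like (step)^p
    return 0.5 * h / k ** (1.0 / p)

# Data of Example 2.7: second-order method, h = 0.1, accuracy to 2dp
h_eps = required_step(3.16220, 3.23007, h=0.1, p=2, eps=0.005)
print(round(h_eps, 4))   # prints 0.0235
```

In practice the returned value would be rounded down to a convenient step size, as described above, and the error re-estimated from two new runs.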


2.3.7 More accurate solution methods: predictor–corrector methods


In Section 2.3.5 we showed how the third term in the Taylor series expansion

$$x(t + h) = x(t) + h\frac{dx}{dt}(t) + \frac{h^2}{2!}\frac{d^2x}{dt^2}(t) + O(h^3) \tag{2.14}$$

could be replaced by either $x(t - h)$ or $(dx/dt)(t - h)$. These are not the only possibilities.
By using appropriate Taylor series expansions, we could replace the term with other values
of $x(t)$ or $dx/dt$. For instance, expanding the function $x(t - 2h)$ about $x(t)$ gives rise to

$$x(t - 2h) = x(t) - 2h\frac{dx}{dt}(t) + 2h^2\frac{d^2x}{dt^2}(t) + O(h^3) \tag{2.15}$$

and eliminating the second-derivative term between (2.14) and (2.15) gives

$$x(t + h) = \tfrac{3}{4}x(t) + \tfrac{1}{4}x(t - 2h) + \tfrac{3}{2}h\frac{dx}{dt}(t) + O(h^3)$$

which, in turn, would give rise to the integration scheme

$$X_{n+1} = \tfrac{3}{4}X_n + \tfrac{1}{4}X_{n-2} + \tfrac{3}{2}hF_n$$


Such a scheme, however, would not seem to offer any advantages to compensate for 
the added difficulties caused by a two-step scheme using non-consecutive values of X. 

The one alternative possibility that does offer some gains is using the value of
$(dx/dt)(t + h)$. Writing the Taylor series expansion of $(dx/dt)(t + h)$ yields

$$\frac{dx}{dt}(t + h) = \frac{dx}{dt}(t) + h\frac{d^2x}{dt^2}(t) + O(h^2)$$

and eliminating the second derivative between this and (2.14) gives

$$x(t + h) = x(t) + \frac{h}{2}\left[\frac{dx}{dt}(t) + \frac{dx}{dt}(t + h)\right] + O(h^3) \tag{2.16}$$

leading to the integration scheme

$$X_{n+1} = X_n + \tfrac{1}{2}h(F_n + F_{n+1}) \tag{2.17}$$

This, like (2.11) and (2.12), is a second-order scheme. It has the problem that, in order
to calculate $X_{n+1}$, the value of $F_{n+1}$ is needed, which, in its turn, requires that the value
of $X_{n+1}$ be known. This seems to be a circular argument!

One way to work around this problem and turn (2.17) into a usable scheme is to start
by working out a rough value of $X_{n+1}$, use that to compute a value of $F_{n+1}$, and then use
(2.17) to compute a more accurate value of $X_{n+1}$. Such a process can be derived as
follows. We know that

$$x(t + h) = x(t) + h\frac{dx}{dt}(t) + O(h^2)$$

Let

$$\hat{x}(t + h) = x(t) + h\frac{dx}{dt}(t) \tag{2.18}$$

then

$$x(t + h) = \hat{x}(t + h) + O(h^2)$$

or, using the subscript notation defined above,

$$x_{n+1} = \hat{x}_{n+1} + O(h^2)$$

Thus

$$\frac{dx}{dt}(t_{n+1}) = f(t_{n+1}, x_{n+1}) = f(t_{n+1}, \hat{x}_{n+1} + O(h^2))$$

$$= f(t_{n+1}, \hat{x}_{n+1}) + O(h^2)\frac{\partial f}{\partial x}(t_{n+1}, \hat{x}_{n+1}) + O(h^4)$$

$$= f(t_{n+1}, \hat{x}_{n+1}) + O(h^2) \tag{2.19}$$

In the subscript notation (2.16) is

$$x_{n+1} = x_n + \tfrac{1}{2}h[f(t_n, x_n) + f(t_{n+1}, x_{n+1})] + O(h^3)$$

Substituting (2.19) into this gives

$$x_{n+1} = x_n + \tfrac{1}{2}h[f(t_n, x_n) + f(t_{n+1}, \hat{x}_{n+1}) + O(h^2)] + O(h^3)$$

That is,

$$x_{n+1} = x_n + \tfrac{1}{2}h[f(t_n, x_n) + f(t_{n+1}, \hat{x}_{n+1})] + O(h^3) \tag{2.20}$$


Equation (2.20) together with (2.18) forms the basis of what is known as a
predictor–corrector method, which is defined by the following scheme:

(1) compute the ‘predicted’ value of $X_{n+1}$, call it $\hat{X}_{n+1}$, from

$$\hat{X}_{n+1} = X_n + hf(t_n, X_n) \tag{2.21a}$$

(2) compute the ‘corrected’ value of $X_{n+1}$ from

$$X_{n+1} = X_n + \tfrac{1}{2}h[f(t_n, X_n) + f(t_{n+1}, \hat{X}_{n+1})] \tag{2.21b}$$

This predictor–corrector scheme, as demonstrated by (2.20), is a second-order method.
It has the advantage over (2.11) and (2.12) of requiring only the value of $X_n$, not $X_{n-1}$ or
$F_{n-1}$. On the other hand, each step requires two evaluations of the function $f(t, x)$, and
so the method is less efficient computationally.
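
The two stages (2.21a) and (2.21b) translate directly into code. The following Python sketch is illustrative rather than definitive; the function name `predictor_corrector` is our own, and the test problem $dx/dt = x^2/(t+1)$, $x(0) = 1$ is the one used throughout this section.

```python
def f(t, x):
    """Right-hand side of dx/dt = x^2/(t + 1)."""
    return x * x / (t + 1.0)

def predictor_corrector(f, t0, x0, h, n_steps):
    """Second-order predictor-corrector scheme (2.21a, b).

    Returns the list [X_0, X_1, ..., X_n_steps].
    """
    xs = [x0]
    x = x0
    for n in range(n_steps):
        t = t0 + n * h
        f_n = f(t, x)                                 # first evaluation of f
        x_pred = x + h * f_n                          # (2.21a): predicted value
        x = x + 0.5 * h * (f_n + f(t + h, x_pred))    # (2.21b): corrected value
        xs.append(x)
    return xs

xs = predictor_corrector(f, 0.0, 1.0, 0.1, 10)
print([round(x, 5) for x in xs[:3]])   # prints [1.0, 1.105, 1.22211]
```

Only the current value $X_n$ is carried from one step to the next, which is why the scheme is self-starting; the price is the second call to `f` inside the loop.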


Example 2.8  Solve the initial-value problem

$$\frac{dx}{dt} = \frac{x^2}{t+1}, \quad x(0) = 1$$

posed in Example 2.2 using the second-order predictor–corrector scheme with a step
size $h = 0.1$. Compute the values of $X(t)$ for $t = 0.1, 0.2, \ldots, 1.0$ and compare them
with the values of the exact solution $x(t)$.

Solution  The exact solution was given in Example 2.2. In this example the initial value of $t$ is
0 and $x(0) = 1$. Using the standard notation we have $t_0 = 0$, and $x_0 = x(t_0) = x(0) = 1$.
The function $f(t, x) = x^2/(t + 1)$. So the first two steps of the computation are

$$\hat{X}_1 = x_0 + hf(t_0, x_0) = x_0 + h\frac{x_0^2}{t_0 + 1} = 1 + 0.1\,\frac{1^2}{0 + 1} = 1.10000$$

$$X_1 = x_0 + \tfrac{1}{2}h[f(t_0, x_0) + f(t_1, \hat{X}_1)] = x_0 + \tfrac{1}{2}h\left(\frac{x_0^2}{t_0 + 1} + \frac{\hat{X}_1^2}{t_1 + 1}\right)$$

$$= 1.00000 + \tfrac{1}{2}(0.1)\left(\frac{1^2}{0 + 1} + \frac{1.10000^2}{0.10000 + 1}\right) = 1.10500$$

$$\hat{X}_2 = X_1 + hf(t_1, X_1) = X_1 + h\frac{X_1^2}{t_1 + 1} = 1.10500 + 0.1\,\frac{1.10500^2}{0.10000 + 1} = 1.21600$$

$$X_2 = X_1 + \tfrac{1}{2}h[f(t_1, X_1) + f(t_2, \hat{X}_2)] = X_1 + \tfrac{1}{2}h\left(\frac{X_1^2}{t_1 + 1} + \frac{\hat{X}_2^2}{t_2 + 1}\right)$$

$$= 1.10500 + \tfrac{1}{2}(0.1)\left(\frac{1.10500^2}{0.10000 + 1} + \frac{1.21600^2}{0.20000 + 1}\right) = 1.22211$$

The complete computation is set out in Figure 2.13.

Figure 2.13 Computational results for Example 2.8.

t_n       X_n        f(t_n, X_n)   X^_{n+1}   f(t_{n+1}, X^_{n+1})   x(t_n)     |x_n - X_n|
0.00000   1.00000    1.00000       1.10000    1.10000                1.00000    0.00000
0.10000   1.10500    1.11002       1.21600    1.23222                1.10535    0.00035
0.20000   1.22211    1.24463       1.34658    1.39482                1.22297    0.00086
0.30000   1.35408    1.41042       1.49513    1.59672                1.35568    0.00160
0.40000   1.50444    1.61667       1.66611    1.85061                1.50710    0.00265
0.50000   1.67781    1.87669       1.86547    2.17500                1.68199    0.00418
0.60000   1.88039    2.20992       2.10138    2.59753                1.88681    0.00642
0.70000   2.12076    2.64567       2.38533    3.16100                2.13051    0.00975
0.80000   2.41110    3.22966       2.73406    3.93426                2.42593    0.01483
0.90000   2.76929    4.03630       3.17292    5.03372                2.79216    0.02287
1.00000   3.22279                                                    3.25889    0.03610

(Here X^_{n+1} denotes the predicted value $\hat{X}_{n+1}$.)


Again the solution to this example can be obtained using MAPLE. The following
worksheet computes the numerical and analytical solutions and compares them at
the required points.


> #set up differential equation
> deq1:=diff(x(t),t)=x(t)^2/(t+1);init1:=x(0)=1;
> #obtain x1, the numerical solution
> x1:=dsolve({deq1, init1},
    numeric,method=classical[heunform],output=listprocedure,
    stepsize=0.1);
> #xa is the analytic solution
> xa:=dsolve({deq1, init1});
> #compute values at required solution points
> for i from 1 to 10 do
    tt:=0.1*i:
    op(2,x1[2])(tt),evalf(subs(t=tt,op(2,xa)));
  end do;


Comparison of the result of Example 2.8 with those of Examples 2.2 and 2.6 shows
that, as we should expect, the predictor–corrector scheme produces results of considerably
higher accuracy than the Euler method and of comparable (though slightly better)
accuracy to the second-order Adams–Bashforth scheme. We also expect the scheme to
have a global error $O(h^2)$, and, in the spirit of Examples 2.2 and 2.6, we confirm this in
Example 2.9.

Example 2.9  Let $X_a$ denote the approximation to the solution of the initial-value problem

$$\frac{dx}{dt} = \frac{x^2}{t+1}, \quad x(0) = 1$$

obtained using the second-order predictor–corrector method with a step size $h = 0.1$, and
$X_b$ that obtained using $h = 0.05$. Compute the values of $X_a(t)$ and $X_b(t)$ for $t = 0.1, 0.2, \ldots,
1.0$. Compare these with the values of $x(t)$, the exact solution of the problem. Compute
the ratio of the errors in $X_a$ and $X_b$.

Solution  The numerical solutions $X_a$ and $X_b$ and their errors are shown in Figure 2.14. The ratio
of the errors confirms that the error behaves roughly as $O(h^2)$.

Figure 2.14 Computational results for Example 2.9.

t         X_a        X_b        x(t)       |x - X_a|   |x - X_b|   |x - X_b|/|x - X_a|
0.00000   1.00000    1.00000    1.00000
0.10000   1.10500    1.10526    1.10535    0.00035     0.00009     0.27
0.20000   1.22211    1.22274    1.22297    0.00086     0.00023     0.27
0.30000   1.35408    1.35525    1.35568    0.00160     0.00043     0.27
0.40000   1.50444    1.50638    1.50710    0.00265     0.00072     0.27
0.50000   1.67781    1.68086    1.68199    0.00418     0.00113     0.27
0.60000   1.88039    1.88507    1.88681    0.00642     0.00173     0.27
0.70000   2.12076    2.12787    2.13051    0.00975     0.00264     0.27
0.80000   2.41110    2.42190    2.42593    0.01483     0.00403     0.27
0.90000   2.76929    2.78592    2.79216    0.02287     0.00624     0.27
1.00000   3.22279    3.24898    3.25889    0.03610     0.00991     0.27
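
An error ratio can be converted into an observed order of convergence: if the global error behaves like $Ch^p$, then halving $h$ divides the error by $2^p$, so $p \approx \log_2(\text{error at } h / \text{error at } \tfrac{1}{2}h)$. The Python sketch below applies this to the predictor–corrector results at $t = 1$. It is illustrative only: the function names are our own, and the closed form $x(t) = 1/(1 - \ln(t+1))$ is assumed for the exact solution.

```python
import math

def f(t, x):
    """Right-hand side of dx/dt = x^2/(t + 1)."""
    return x * x / (t + 1.0)

def exact(t):
    """Assumed closed form of the exact solution with x(0) = 1."""
    return 1.0 / (1.0 - math.log(t + 1.0))

def predictor_corrector_final(f, t0, x0, h, n_steps):
    """Advance n_steps with scheme (2.21a, b) and return the final value."""
    x = x0
    for n in range(n_steps):
        t = t0 + n * h
        f_n = f(t, x)
        x_pred = x + h * f_n
        x = x + 0.5 * h * (f_n + f(t + h, x_pred))
    return x

def observed_order(err_coarse, err_fine):
    """Order p implied by errors at step sizes h and h/2, assuming error ~ C h^p."""
    return math.log(err_coarse / err_fine, 2)

err_a = abs(exact(1.0) - predictor_corrector_final(f, 0.0, 1.0, 0.1, 10))
err_b = abs(exact(1.0) - predictor_corrector_final(f, 0.0, 1.0, 0.05, 20))
p_obs = observed_order(err_a, err_b)
print(f"errors {err_a:.5f}, {err_b:.5f}; observed order {p_obs:.2f}")
```

The observed order falls a little short of 2 because the step sizes are finite; it approaches 2 as $h$ is reduced.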


In Section 2.3.5 we mentioned the difficulties that multistep methods introduce
with respect to starting the computation. We now have a second-order method that
does not need values of $X_{n-1}$ or earlier. Obviously we can use this method just as
it stands, but we then pay the penalty, in computer processing time, of the extra
evaluation of $f(t, x)$ at each step of the process. An alternative scheme is to use the
second-order predictor–corrector for the first step and then, because the appropriate
function values are now available, change to the second-order Adams–Bashforth
scheme or even, if the problem is one for which the scheme given by (2.11) (which
is called the central difference scheme) is stable, to that process. In this way we create
a hybrid process that retains the $O(h^2)$ convergence and simultaneously minimizes the
computational load.

The principles by which we derive (2.16) and so the integration scheme (2.17) can
be extended to produce higher-order schemes. Such schemes are called the
Adams–Moulton formulae and are as follows:

$$X_{n+1} = X_n + hF_{n+1}$$

$$X_{n+1} = X_n + \tfrac{1}{2}h(F_{n+1} + F_n)$$

$$X_{n+1} = X_n + \tfrac{1}{12}h(5F_{n+1} + 8F_n - F_{n-1})$$

$$X_{n+1} = X_n + \tfrac{1}{24}h(9F_{n+1} + 19F_n - 5F_{n-1} + F_{n-2})$$

These are first-, second-, third- and fourth-order formulae respectively. They are all like
the one we derived in this section in that the value of $F_{n+1}$ is required in order to compute
the value of $X_{n+1}$. They are therefore usually used as corrector formulae in
predictor–corrector schemes. The most common way to do this is to use the $(p - 1)$th-order
Adams–Bashforth formula as predictor, with the $p$th-order Adams–Moulton formula
as corrector. This combination can be shown to always produce a scheme of $p$th order.
The predictor–corrector scheme we have derived in this section is of this form, with $p = 2$.
Of course, for $p > 2$ the predictor–corrector formula produced is no longer self-starting,
and other means have to be found to produce the first few values of $X$. We shall return to
this topic in the next section.

It may be noted that one of the alternative methods offered by MATLAB for the
numerical solution of ODEs is based on the families of Adams–Bashforth and
Adams–Moulton formulae.


2.3.8 More accurate solution methods: Runge–Kutta methods

Figure 2.15 A geometrical interpretation of the second-order predictor–corrector method.


Another class of higher-order methods comprises the Runge–Kutta methods. The
mathematical derivation of these methods is quite complicated and beyond the scope of this book.
However, their general principle can be explained informally by a graphical argument.
Figure 2.15 shows a geometrical interpretation of the second-order predictor–corrector
method introduced in the last section. Starting at the point $(t_n, X_n)$, point A in the diagram,
the predicted value $\hat{X}_{n+1}$ is calculated. The line AB has gradient $f(t_n, X_n)$, so the ordinate
of the point B is the predicted value $\hat{X}_{n+1}$. The line AC in the diagram has gradient
$f(t_{n+1}, \hat{X}_{n+1})$, the gradient of the direction field of the equation at point B, so point C
has ordinate $X_n + hf(t_{n+1}, \hat{X}_{n+1})$. The midpoint of the line BC, point D, has ordinate
$X_n + \tfrac{1}{2}h(f(t_n, X_n) + f(t_{n+1}, \hat{X}_{n+1}))$, which is the value of $X_{n+1}$ given by the corrector formula.

Geometrically speaking, the predictor–corrector scheme can be viewed as the process
of calculating the gradient of the direction field of the equation at points A and B and
then assuming that the average gradient of the solution over the interval $(t_n, t_{n+1})$ is
reasonably well estimated by the average of the gradients at these two points. The Euler
method, of course, is equivalent to assuming that the gradient at point A is a good
estimate of the average gradient of the solution over the interval $(t_n, t_{n+1})$. Given this
insight, it is unsurprising that the error performance of the predictor–corrector method
is superior to that of the Euler method.

Runge–Kutta methods extend this principle by using the gradient at several points in
the interval $(t_n, t_{n+1})$ to estimate the average gradient of the solution over the interval.
The most commonly used Runge–Kutta method is a fourth-order one which can be
expressed as follows:

$$c_1 = hf(t_n, X_n) \tag{2.22a}$$

$$c_2 = hf(t_n + \tfrac{1}{2}h, X_n + \tfrac{1}{2}c_1) \tag{2.22b}$$

$$c_3 = hf(t_n + \tfrac{1}{2}h, X_n + \tfrac{1}{2}c_2) \tag{2.22c}$$

$$c_4 = hf(t_n + h, X_n + c_3) \tag{2.22d}$$

$$X_{n+1} = X_n + \tfrac{1}{6}(c_1 + 2c_2 + 2c_3 + c_4) \tag{2.22e}$$

Geometrically, this can be understood as the process shown in Figure 2.16. The line AB
has the same gradient as the equation’s direction field at point A. The ordinate of this
line at $t_n + \tfrac{1}{2}h$ defines point B. The line AC has gradient equal to that of the
direction field at point B. This line defines point C. Finally, a line AD, with gradient
equal to that of the direction field at point C, defines point D. The average
gradient of the solution over the interval $(t_n, t_{n+1})$ is then estimated from a weighted
average of the gradients at points A, B, C and D. It is intuitively acceptable that such a
process is likely to give a highly accurate estimate of the average gradient over the
interval.

Figure 2.16 A geometrical interpretation of the fourth-order Runge–Kutta method.
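
The fourth-order Runge–Kutta scheme (2.22a–e) is short enough to code directly. The Python sketch below is an illustration under our own naming (`rk4_step`, `rk4`), applied to the initial-value problem $dx/dt = x^2/(t+1)$, $x(0) = 1$ used in the earlier examples of this section.

```python
def f(t, x):
    """Right-hand side of dx/dt = x^2/(t + 1)."""
    return x * x / (t + 1.0)

def rk4_step(f, t, x, h):
    """One step of the fourth-order Runge-Kutta scheme (2.22a-e)."""
    c1 = h * f(t, x)                          # gradient at the start of the step
    c2 = h * f(t + 0.5 * h, x + 0.5 * c1)     # first estimate at the midpoint
    c3 = h * f(t + 0.5 * h, x + 0.5 * c2)     # improved estimate at the midpoint
    c4 = h * f(t + h, x + c3)                 # gradient at the end of the step
    return x + (c1 + 2.0 * c2 + 2.0 * c3 + c4) / 6.0

def rk4(f, t0, x0, h, n_steps):
    """Return [X_0, X_1, ..., X_n_steps] computed with rk4_step."""
    xs = [x0]
    for n in range(n_steps):
        xs.append(rk4_step(f, t0 + n * h, xs[-1], h))
    return xs

xs = rk4(f, 0.0, 1.0, 0.1, 10)
print(f"X(0.1) = {xs[1]:.7f}, X(1.0) = {xs[10]:.7f}")
```

Each call to `rk4_step` evaluates `f` four times, which is the computational price paid for fourth-order accuracy from a single-step method.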


As was said before, the mathematical proof that the process defined by (2.22a–e) is
a fourth-order process is beyond the scope of this text. It is interesting to note that the
predictor–corrector method defined by (2.21a, b) could also be expressed as

$$c_1 = hf(t_n, X_n)$$

$$c_2 = hf(t_n + h, X_n + c_1)$$

$$X_{n+1} = X_n + \tfrac{1}{2}(c_1 + c_2)$$

This is also of the form of a Runge–Kutta method (the second-order Runge–Kutta
method), so we find that the second-order Runge–Kutta method and the second-order
Adams–Bashforth/Adams–Moulton predictor–corrector are, in fact, equivalent
processes.

Example 2.10  Let $X_a$ denote the approximation to the solution of the initial-value problem

$$\frac{dx}{dt} = \frac{x^2}{t+1}, \quad x(0) = 1$$

obtained using the fourth-order Runge–Kutta method with a step size $h = 0.1$, and $X_b$
that obtained using $h = 0.05$. Compute the values of $X_a(t)$ and $X_b(t)$ for $t = 0.1, 0.2, \ldots,
1.0$. Compare these with the values of $x(t)$, the exact solution of the problem. Compute
the ratio of the errors in $X_a$ and $X_b$.

Solution  The exact solution was given in Example 2.2. The numerical solutions $X_a$ and $X_b$ and their
errors are presented in Figure 2.17.

This example shows, first, that the Runge–Kutta scheme, being a fourth-order scheme,
has considerably smaller errors, in absolute terms, than any of the other methods we
have met so far (note that Figure 2.17 does not give raw errors but errors times 1000!) and,
second, that the expectation we have that the global error should be $O(h^4)$ is roughly
borne out in practice (the ratio of $|x - X_a|$ to $|x - X_b|$ is roughly 16 : 1).

Figure 2.17 Computational results for Example 2.10.





t         X_a          X_b          x(t)         |x - X_a| x 10^3   |x - X_b| x 10^3   |x - X_b|/|x - X_a|
0.00000   1.0000000    1.0000000    1.0000000
0.10000   1.1053507    1.1053512    1.1053512    0.00055            0.00004            0.0682
0.20000   1.2229733    1.2229745    1.2229746    0.00133            0.00009            0.0680
0.30000   1.3556802    1.3556825    1.3556827    0.00246            0.00017            0.0679
0.40000   1.5070918    1.5070957    1.5070959    0.00410            0.00028            0.0678
0.50000   1.6819805    1.6819866    1.6819871    0.00653            0.00044            0.0678
0.60000   1.8867952    1.8868047    1.8868054    0.01020            0.00069            0.0677
0.70000   2.1304915    2.1305064    2.1305074    0.01592            0.00108            0.0677
0.80000   2.4259031    2.4259266    2.4259283    0.02519            0.00171            0.0677
0.90000   2.7921155    2.7921537    2.7921565    0.04103            0.00278            0.0677
1.00000   3.2588214    3.2588866    3.2588914    0.06994            0.00474            0.0678

The table of values in Figure 2.17 can be obtained using MAPLE with the
appropriate setting of the numerical method. The following worksheet computes
the solutions specified and composes the required table.


> #set up differential equation
> deq1:=diff(x(t),t)=x(t)^2/(t+1);init1:=x(0)=1;
> #obtain x1 and x2, the numerical solutions
> x1:=dsolve({deq1, init1},numeric,method=classical[rk4],
    output=listprocedure,stepsize=0.1);
> x2:=dsolve({deq1, init1},numeric,method=classical[rk4],
    output=listprocedure,stepsize=0.05);
> #xa is the analytic solution
> xa:=dsolve({deq1, init1});
> #print the table of values, errors (times 1000) and error ratio
> printf("%8s%12s%12s%12s%12s%12s%10s\n",
    "t","X1","X2","x(t)","err1*1e3","err2*1e3","ratio");
> for i from 1 to 10 do
    tt:=0.1*i:
    xx1:=op(2,x1[2])(tt):
    xx2:=op(2,x2[2])(tt):
    xxa:=evalf(subs(t=tt,op(2,xa))):
    printf("%8.5f%12.7f%12.7f%12.7f%12.5f%12.5f%10.4f\n",
      tt,xx1,xx2,xxa,abs(xxa-xx1)*1e3,abs(xxa-xx2)*1e3,
      abs(xxa-xx2)/abs(xxa-xx1));
  end do;


It is interesting to note that the MAPLE results in the right-hand column, the ratio 
of the errors in the two numerical solutions, vary slightly from those in Figure 2.17. 
The results in Figure 2.17 were computed using the high-level programming language 
Pascal which uses a different representation of floating point numbers from that 
used by MAPLE. The variation in the results is an effect of the differing levels of 
precision in the two languages. The differences are, of course, small and do not 
change the overall message obtained from the figure. 


Runge–Kutta schemes are single-step methods in the sense that they only require the
value of $X_n$, not the value of $X$ at any steps prior to that. They are therefore entirely
self-starting, unlike the higher-order predictor–corrector and other multistep methods. On the
other hand, Runge–Kutta methods proceed by effectively creating substeps within each step.
Therefore they require more evaluations of the function $f(t, x)$ at each step than multistep
methods of equivalent order of accuracy. For this reason, they are computationally less
efficient. Because they are self-starting, however, Runge–Kutta methods can be used
to start the process for multistep methods. An example of an efficient scheme that
is consistently of fourth order is as follows. Start by taking two steps
using the fourth-order Runge–Kutta method. At this point values of $X_0$, $X_1$ and $X_2$ are
available, so, to achieve computational efficiency, change to the three-step fourth-order
predictor–corrector consisting of the third-order Adams–Bashforth/fourth-order
Adams–Moulton pair.
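
The hybrid scheme just described can be sketched as follows. This Python code is an illustration under stated assumptions: the function names are our own, the test problem is the one used in the examples of this section, and the closed form $x(t) = 1/(1 - \ln(t+1))$ is assumed for its exact solution.

```python
import math

def f(t, x):
    """Right-hand side of dx/dt = x^2/(t + 1)."""
    return x * x / (t + 1.0)

def exact(t):
    """Assumed closed form of the exact solution with x(0) = 1."""
    return 1.0 / (1.0 - math.log(t + 1.0))

def rk4_step(f, t, x, h):
    """One step of the fourth-order Runge-Kutta scheme (2.22a-e)."""
    c1 = h * f(t, x)
    c2 = h * f(t + 0.5 * h, x + 0.5 * c1)
    c3 = h * f(t + 0.5 * h, x + 0.5 * c2)
    c4 = h * f(t + h, x + c3)
    return x + (c1 + 2.0 * c2 + 2.0 * c3 + c4) / 6.0

def rk4_started_abm4(f, t0, x0, h, n_steps):
    """Two Runge-Kutta steps, then the third-order Adams-Bashforth predictor
    with the fourth-order Adams-Moulton corrector. Requires n_steps >= 2."""
    ts = [t0 + i * h for i in range(n_steps + 1)]
    xs = [x0]
    for n in range(2):                            # start-up: X_1 and X_2
        xs.append(rk4_step(f, ts[n], xs[n], h))
    fs = [f(ts[i], xs[i]) for i in range(3)]      # F_0, F_1, F_2
    for n in range(2, n_steps):
        # predictor: third-order Adams-Bashforth formula
        x_pred = xs[n] + h / 12.0 * (23.0 * fs[n] - 16.0 * fs[n - 1] + 5.0 * fs[n - 2])
        f_pred = f(ts[n + 1], x_pred)
        # corrector: fourth-order Adams-Moulton formula
        x_new = xs[n] + h / 24.0 * (9.0 * f_pred + 19.0 * fs[n] - 5.0 * fs[n - 1] + fs[n - 2])
        xs.append(x_new)
        fs.append(f(ts[n + 1], x_new))            # stored for reuse in later steps
    return xs

xs = rk4_started_abm4(f, 0.0, 1.0, 0.05, 20)
print(f"X(1) = {xs[-1]:.7f}, error = {abs(xs[-1] - exact(1.0)):.2e}")
```

After the two start-up steps, each step needs only two evaluations of `f` (one for the predictor and one for the stored value $F_{n+1}$), compared with four for a Runge–Kutta step.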


2.3.9 Exercises

(Note that Questions 8–15 may be attempted using a hand-held calculator, particularly
if it is of the programmable variety. The arithmetic will, however, be found to be tedious,
and the use of computer assistance is recommended if the maximum benefit is to be
obtained from completing these questions.)


8  Using the second-order Adams–Bashforth
method (start the process with a single step
using the second-order predictor–corrector
method),

(a) compute an estimate of $x(0.5)$ for the initial-
value problem

$$\frac{dx}{dt} = x^2\sin t - x, \quad x(0) = 0.2$$

using step size $h = 0.1$;
(b) compute an estimate of x(1.2) for the initial- 
value problem 
dx 2% 


== = е", 


0.5) = 0. 
T x(0.5)=0.5 


using step size $h = 0.1$.


9  Using the third-order Adams–Bashforth method
(start the process with two second-order
predictor–corrector method steps) compute an
estimate of $x(0.5)$ for the initial-value problem

$$\frac{dx}{dt} = \sqrt{x^2 + 2t}, \quad x(0) = 1$$

using step size $h = 0.1$.


10  Using the second-order predictor–corrector method,

(a) compute an estimate of $x(0.5)$ for the initial-
value problem

$$\frac{dx}{dt} = (2t + x)\sin 2t, \quad x(0) = 0.5$$

using step size $h = 0.05$;
(b) compute an estimate of x(1) for the initial-value 
problem 


dx l+x 


di maun O=? 


using step size $h = 0.1$.


11  Write down the first three terms of the Taylor series
expansions of the functions

$$\frac{dx}{dt}(t - h) \quad \text{and} \quad \frac{dx}{dt}(t - 2h)$$

about $x(t)$. Use these two equations to eliminate

$$\frac{d^2x}{dt^2}(t) \quad \text{and} \quad \frac{d^3x}{dt^3}(t)$$

from the Taylor series expansion of the function
$x(t + h)$ about $x(t)$. Show that the resulting formula
for $x(t + h)$ is the third member of the Adams–Bashforth
family, and hence confirm that this
Adams–Bashforth method is a third-order method.


12  Write down the first three terms of the Taylor series
expansions of the functions

$$\frac{dx}{dt}(t + h) \quad \text{and} \quad \frac{dx}{dt}(t - h)$$

about $x(t)$. Use these two equations to eliminate

$$\frac{d^2x}{dt^2}(t) \quad \text{and} \quad \frac{d^3x}{dt^3}(t)$$

from the Taylor series expansion of the function
$x(t + h)$ about $x(t)$. Show that the resulting formula
for $x(t + h)$ is the third member of the Adams–Moulton
family, and hence confirm that this
Adams–Moulton method is a third-order method.


13  Write down the first four terms of the Taylor series
expansion of the function $x(t - h)$ about $x(t)$, and the
first three terms of the expansion of the function

$$\frac{dx}{dt}(t - h)$$

about $x(t)$. Use these two equations to eliminate

$$\frac{d^2x}{dt^2}(t) \quad \text{and} \quad \frac{d^3x}{dt^3}(t)$$

from the Taylor series expansion of the function
$x(t + h)$ about $x(t)$. Show that the resulting formula is

$$X_{n+1} = -4X_n + 5X_{n-1} + h(4F_n + 2F_{n-1}) + O(h^4)$$

Show that this method is a linear combination of the
second-order Adams–Bashforth method and the
central difference method (that is, the scheme based
on (2.9)). What do you think, in view of this, might
be its disadvantages?

14  Using the third-order Adams–Bashforth–Moulton
predictor–corrector method (that is, the second-order
Adams–Bashforth formula as predictor and
the third-order Adams–Moulton formula as
corrector), compute an estimate of $x(0.5)$ for
the initial-value problem

$$\frac{dx}{dt} = x^2 + t^2, \quad x(0.3) = 0.1$$

using step size $h = 0.05$. (You will need to employ
another method for the first step to start this scheme:
use the fourth-order Runge–Kutta method.)


15  Using the fourth-order Runge–Kutta method,


(a) compute an estimate of x(0.75) for the initial- 
value problem 


— х(0) = 1 
dt 


using step size $h = 0.15$;
(b) compute an estimate of x(2) for the initial-value 


problem 
d 1 
2 


using step size h=0.1. 


16  Consider the initial-value problem


Фора x) =-1 

dt 

(a) Compute estimates of $x(2)$ using the second-order
Adams–Bashforth scheme (using the
second-order predictor–corrector to start the
computation) with step sizes $h = 0.2$ and $0.1$.
From these two estimates of $x(2)$ estimate what
step size would be needed to compute an
estimate of $x(2)$ accurate to 3dp. Compute $X(2)$,
first using your estimated step size and second
using half your estimated step size. Does the
required accuracy appear to have been achieved?

(b) Compute estimates of $x(2)$ using the second-order
predictor–corrector scheme with step
sizes $h = 0.2$ and $0.1$. From these two estimates
of $x(2)$ estimate what step size would be
needed with this scheme to compute an
estimate of $x(2)$ accurate to 3dp. Compute
$X(2)$, first using your estimated step size and
second using half your estimated step size.
Does the required accuracy appear to have
been achieved?

(c) Compute estimates of $x(2)$ using the fourth-order
Runge–Kutta scheme with step sizes
$h = 0.4$ and $0.2$. From these two estimates of
$x(2)$ estimate what step size would be needed to
compute an estimate of $x(2)$ accurate to 5dp.
Compute $X(2)$, first using your estimated step
size and second using half your estimated step
size. Does the required accuracy appear to have
been achieved?


17  For the initial-value problem


2 =e, x(1)=1 
find, by any method, an estimate, accurate to 5dp, of 
the value of x(3). 


Note: All of the exercises in this section can be 
completed by programming the algorithms in a 
high-level computer language such as Pascal, 

C and Java. Programming in a similar high-level 
style can be achieved using the language constructs 
embedded within the MATLAB and MAPLE 
packages. MAPLE, as we have already seen, 
and MATLAB also allow a higher-level style 

of programming using their built-in procedures 
for numerical solution of ODEs. Both MATLAB 
and MAPLE have very sophisticated built-in 
procedures, but MAPLE also allows the user 

to specify that it should use simpler algorithms 
(which it calls ‘classic’ algorithms). Amongst 
these simpler algorithms are many of the 
algorithms we discuss in this chapter. In the 
preceding exercise set, those which specify the 
Runge-Kutta method and the second-order 
predictor–corrector could be completed using
MAPLE’s dsolve procedure specifying the
relevant ‘classic’ solution methods.


Stiff equations


There is a class of differential equations, known as stiff differential equations, that are
apt to be somewhat troublesome to solve numerically. It is beyond the scope of this text
to explore the topic of stiff equations in any great detail. It is, however, important to be
aware of the possibility of difficulties from this source and to be able to recognize the
sort of equations that are likely to be stiff. In that spirit we shall present a very informal
treatment of stiff equations and the sort of troubles that they cause. Example 2.11 shows
the sort of behaviour that is typical of stiff differential equations.

Example 2.11  The equation

$$\frac{dx}{dt} = 1 - x, \quad x(0) = 2 \tag{2.23}$$

has analytical solution $x = 1 + e^{-t}$. The equation

$$\frac{dx}{dt} = 50(1 - x) + 50e^{-t}, \quad x(0) = 2 \tag{2.24}$$

has analytical solution $x = 1 + \tfrac{1}{49}(50e^{-t} - e^{-50t})$. The two solutions are shown in
Figure 2.18.

Figure 2.18 The analytical solutions of (2.23) and (2.24).

Suppose that it were not possible to solve the two equations analytically and
that numerical solutions must be sought. The form of the two solutions shown in
Figure 2.18 is not very different, and it might be supposed (at least naively) that the
numerical solution of the two equations would present similar problems. This, however,
is far from the case.

Figure 2.19 shows the results of solving the two equations using the second-order
predictor-corrector method with step size h = 0.01. The numerical and exact solutions
of (2.23) are denoted by $X_1$ and $x_1$ respectively, and those of (2.24) by $X_2$ and $x_2$. The
third and fifth columns give the errors in the numerical solutions (compared with the
exact solutions), and the last column gives the ratio of the errors. The solution $X_1$ is seen
to be considerably more accurate than $X_2$ using the same step size.

Figure 2.19 Computational results for Example 2.11; h = 0.01.

t          X_1        |X_1 - x_1|   X_2        |X_2 - x_2|   Ratio of errors
0.000 00   2.000 00   0.000 000     2.000 00   0.000 000
0.100 00   1.904 84   0.000 002     1.923 15   0.000 017     11.264 68
0.200 00   1.818 73   0.000 003     1.835 47   0.000 028     10.022 19
0.300 00   1.740 82   0.000 004     1.755 96   0.000 026      6.864 34
0.400 00   1.670 32   0.000 005     1.684 02   0.000 023      5.150 07
0.500 00   1.606 54   0.000 005     1.618 93   0.000 021      4.120 06
0.600 00   1.548 82   0.000 006     1.560 03   0.000 019      3.433 38
0.700 00   1.496 59   0.000 006     1.506 74   0.000 017      2.942 90
0.800 00   1.449 34   0.000 006     1.458 51   0.000 016      2.575 03
0.900 00   1.406 58   0.000 006     1.414 88   0.000 014      2.288 92
1.000 00   1.367 89   0.000 006     1.375 40   0.000 013      2.060 02

Figure 2.20 Computational results for Example 2.11; h = 0.025.

t          X_1        |X_1 - x_1|   X_2        |X_2 - x_2|   Ratio of errors
0.000 00   2.000 00   0.000 000     2.000 00   0.000 000
0.100 00   1.904 85   0.000 010     1.922 04   0.001 123     116.951 24
0.200 00   1.818 75   0.000 017     1.835 67   0.000 231      13.270 10
0.300 00   1.740 84   0.000 024     1.756 25   0.000 317      13.438 84
0.400 00   1.670 35   0.000 028     1.684 30   0.000 296      10.384 39
0.500 00   1.606 56   0.000 032     1.619 18   0.000 268       8.328 98
0.600 00   1.548 85   0.000 035     1.560 25   0.000 243       6.942 36
0.700 00   1.496 62   0.000 037     1.506 94   0.000 220       5.950 68
0.800 00   1.449 37   0.000 038     1.458 70   0.000 199       5.206 82
0.900 00   1.406 61   0.000 039     1.415 05   0.000 180       4.628 26
1.000 00   1.367 92   0.000 039     1.375 55   0.000 163       4.165 42

Figure 2.21 Computational results for Example 2.11; h = 0.05.

t          X_1        |X_1 - x_1|   X_2           |X_2 - x_2|
0.000 00   2.000 00   0.000 000        2.000 00     0.000 000
0.100 00   1.904 88   0.000 039        1.873 43     0.049 740
0.200 00   1.818 80   0.000 071        1.707 36     0.128 075
0.300 00   1.740 91   0.000 096        1.421 02     0.334 914
0.400 00   1.670 44   0.000 116        0.802 59     0.881 408
0.500 00   1.606 66   0.000 131       -0.705 87     2.324 778
0.600 00   1.548 95   0.000 142       -4.576 42     6.136 434
0.700 00   1.496 74   0.000 150      -14.695 10    16.201 818
0.800 00   1.449 48   0.000 156      -41.322 43    42.780 932
0.900 00   1.406 73   0.000 158     -111.551 73   112.966 595
1.000 00   1.368 04   0.000 159     -296.925 40   298.300 783


Figure 2.20 is similar to Figure 2.19, but with a step size h = 0.025. As we might
expect, the error in the solution $X_1$ is larger by a factor of roughly six (the global error
of the second-order predictor-corrector method is $O(h^2)$). The errors in $X_2$, however,
are larger by more than the expected factor, as is evidenced by the increase in the ratio
of the error in $X_2$ to that in $X_1$.

Figure 2.21 shows the results obtained using a step size h = 0.05. The errors in $X_1$
are again larger by about the factor expected (25 when compared with Figure 2.19). The
solution $X_2$, however, shows little relationship to the exact solution $x_2$, so little that the
error at t = 1 is over 20 000% of the exact solution. Obviously a numerical method that
causes such large errors to accumulate is not at all satisfactory.

In Section 2.3.5 we met the idea that some numerical methods can, when applied to 
some classes of differential equation, show instability. What has happened here is, of 
course, that the predictor-corrector method is showing instability when used to solve 
(2.24) with a step size larger than some critical limit. Unfortunately the same behaviour 
is also manifest by the other methods that we have already come across — the problem 
lies with the equation (2.24), which is an example of a stiff differential equation. 
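
The behaviour described in Example 2.11 can be reproduced with a few lines of code. The following Python sketch is an illustration only (the text itself uses MAPLE and MATLAB). It assumes that the second-order predictor-corrector method is the Euler predictor followed by a single trapezoidal corrector, as in the worked examples of this chapter, and all function names are hypothetical.

```python
import math


def predictor_corrector(f, t0, x0, h, n_steps):
    """Advance x' = f(t, x) by n_steps of the second-order predictor-corrector."""
    x = x0
    for k in range(n_steps):
        t = t0 + k * h
        slope = f(t, x)
        x_pred = x + h * slope                         # Euler predictor
        x = x + 0.5 * h * (slope + f(t + h, x_pred))   # trapezoidal corrector
    return x


def rhs_223(t, x):
    """Right-hand side of equation (2.23)."""
    return 1.0 - x


def rhs_224(t, x):
    """Right-hand side of equation (2.24)."""
    return 50.0 * (1.0 - x) + 50.0 * math.exp(-t)


def exact_223(t):
    return 1.0 + math.exp(-t)


def exact_224(t):
    return 1.0 + (50.0 * math.exp(-t) - math.exp(-50.0 * t)) / 49.0


def errors_at_one(h):
    """Absolute errors at t = 1 for (2.23) and (2.24) with step size h."""
    n_steps = round(1.0 / h)
    err1 = abs(predictor_corrector(rhs_223, 0.0, 2.0, h, n_steps) - exact_223(1.0))
    err2 = abs(predictor_corrector(rhs_224, 0.0, 2.0, h, n_steps) - exact_224(1.0))
    return err1, err2


for h in (0.01, 0.025, 0.05):
    err1, err2 = errors_at_one(h)
    print(f"h = {h}: error in X1 = {err1:.6f}, error in X2 = {err2:.6f}")
```

For the linear test equation dx/dt = λx this predictor-corrector multiplies the solution by 1 + hλ + (hλ)²/2 at each step, which has magnitude less than one only when -2 < hλ < 0. With λ = -50 that requires h < 0.04, which is consistent with the tables: h = 0.01 and h = 0.025 are stable, while h = 0.05 is not.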


The typical pattern with stiff differential equations is that, in order to avoid
instability, the step size used to solve the equation using normal numerical methods must be
very small when compared with the interval over which the equation is to be solved. In
other words, the number of steps to be taken is very large and the solution is costly in
time and computing resources. Essentially, stiff equations are equations whose solution
contains terms involving widely varying time scales. That (2.24) is of this type is
evidenced by the presence of terms in both $e^{-t}$ and $e^{-50t}$ in the analytical solution. In order
to solve such equations accurately, a step must be chosen that is small enough to cope
with the shortest time scale. If the solution is required for times comparable to the long
time scales, this can mean that very large numbers of steps are needed and the computer
processing time needed to solve the problem becomes prohibitive. In Example 2.11 the
time scale of the rapidly varying and the more slowly varying components of the
solution differed by only a factor of 50. It is not unusual, in the physical problems arising
from engineering investigations, to find time scales differing by three or more orders
of magnitude; that is, factors of 1000 or more. In these cases the problems caused
are proportionately amplified. Fortunately a number of numerical methods that are
particularly efficient at solving stiff differential equations have been developed. It is
beyond the scope of this text to treat these in any detail.

From the engineering point of view, the implication of the existence of stiff equations 
is that engineers must be aware of the possibility of meeting such equations and also of the 
nature of the difficulty for the numerical methods — the widely varying time scales inherent 
in the problem. It is probably easier to recognize that an engineering problem is likely to 
give rise to a stiff equation or equations because of the physical nature of the problem than 
it is to recognize a stiff equation in its abstract form isolated from the engineering con- 
text from which it arose. As is often the case, a judicious combination of mathematical 
reasoning and engineering intuition is more powerful than either approach in isolation. 

Both MAPLE and MATLAB feature procedures for the numerical solution of ODEs 
which are designed to deal efficiently with stiff equations. The user may be tempted to 
think that a simple way to negotiate the problem of stiff equations is to use the stiff equation 
solvers for all ordinary differential equations. However, the stiff equation methods are less 
computationally efficient for non-stiff equations so it is worth trying to identify which 
type of equation one is facing and using the most appropriate methods. 


Computer software libraries and the ‘state of the art’ 


In the last few sections we have built up some basic methods for the integration of
first-order ordinary differential equations. These methods, particularly the more sophisticated
ones (the fourth-order Runge-Kutta and the predictor-corrector methods), suffice for
many of the problems arising in engineering practice. However, for more demanding
problems (demanding in terms of the scale of the problem or because the problem is
characterized by ill behaviour of some form) there exist more sophisticated methods
than those we are able to present in this book.

All the methods that we have presented in the last few sections use a fixed step size. 
Among the more sophisticated methods to which we have just alluded are some that use 
a variable step size. In Section 2.3.6 we showed how Richardson extrapolation can be 
used to estimate the size of the error in a numerical solution and, furthermore, to estim- 
ate the step size that should be used in order to compute a solution of a differential 
equation to some desired accuracy. The principle of the variable-step methods is that a 
running check is kept of the estimated error in the solution being computed. The error 
may be estimated by a formula derived along principles similar to that of Richardson 
extrapolation. This running estimate of the error is used to predict, at any point in the 
computation, how large a step can be taken while still computing a solution within any 
given error bound specified by the user. The step size used in the solution can be altered 
accordingly. If the error is approaching the limits of what is acceptable then the step 
size can be reduced; if it is very much smaller than that which can be tolerated then the step 
size may be increased in the interests of speedy and efficient computing. For multistep 
methods the change of step size can lead to quite complicated formulae or procedures. 
As an alternative, or in addition, to a change of step size, changes can be made in the 
order of the integration formula used. When increased accuracy is required, instead 
of reducing the step size, the order of the integration method can be increased, and 
vice versa. Implementations of the best of these more sophisticated schemes are readily 
available in software packages, such as MAPLE and MATLAB, and software libraries 
such as the NAG library. 
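
The idea of a running error check can be sketched in a few lines. The following Python fragment is only an illustration under stated assumptions, not the scheme used by MAPLE, MATLAB or the NAG library: it takes one predictor-corrector step of size h and two of size h/2, uses their difference (divided by 3, since the local error of a second-order method scales as h³) as the error estimate, and rescales h accordingly. The safety factor 0.9 and the limits 0.2 and 2.0 on the change of step size are conventional but arbitrary choices, and all names are hypothetical.

```python
import math


def pc_step(f, t, x, h):
    """One step of the second-order predictor-corrector method."""
    slope = f(t, x)
    x_pred = x + h * slope
    return x + 0.5 * h * (slope + f(t + h, x_pred))


def adaptive_pc(f, t0, x0, t_end, tol, h_init=0.1):
    """Integrate x' = f(t, x) from t0 to t_end, keeping the estimated
    error of each accepted step below tol. Returns (x, accepted step sizes)."""
    t, x, h = t0, x0, h_init
    accepted = []
    while t_end - t > 1e-12:
        h = min(h, t_end - t)                    # do not overshoot t_end
        coarse = pc_step(f, t, x, h)             # one step of size h
        half = pc_step(f, t, x, 0.5 * h)         # two steps of size h/2
        fine = pc_step(f, t + 0.5 * h, half, 0.5 * h)
        err = abs(fine - coarse) / 3.0           # Richardson-style estimate
        if err <= tol:                           # accept the more accurate value
            t += h
            x = fine
            accepted.append(h)
        # Rescale the step, whether the step was accepted or rejected.
        factor = 0.9 * (tol / err) ** (1.0 / 3.0) if err > 0.0 else 2.0
        h *= min(2.0, max(0.2, factor))
    return x, accepted


x_end, steps = adaptive_pc(lambda t, x: 1.0 - x, 0.0, 2.0, 1.0, 1e-6)
print(x_end, len(steps), min(steps), max(steps))
```

Applied to equation (2.23), the step size grows as the solution flattens out, which is exactly the economy that variable-step methods are designed to deliver.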

The availability of complex and sophisticated ‘state of the art’ methods is not the 
only argument for the use of software packages and libraries. It is a good engineering 
principle that, if an engineer wishes to design and construct a reliable engineering artefact, 
tried and proven components of known reliability and performance characteristics 
should be used. This principle can also be extended to engineering software. It is almost 
always both more efficient, in terms of expenditure of time and intellectual energy, and 
more reliable, in terms of elimination of bugs and unwanted side-effects, to use soft- 
ware from a known and proven source than to write programs from scratch. 

For both of the foregoing reasons, when reliable mathematical packages, such as 
MAPLE and MATLAB, and software libraries are available, their use is strongly 
recommended. MAPLE is arguably the leading mathematical software package available 
today, offering both symbolic manipulation (computer algebra) and numerical problem 
solving across the whole span of mathematics. Amongst these, as we have already 
noted, MAPLE includes routines for the numerical solution of systems of ordinary 
differential equations. These routines are highly sophisticated, offering alternative 
methods suitable for stiff and non-stiff problems, using fixed time steps or variable time 
steps and optimized either for speed or for accuracy. The MATLAB package, with its 
Simulink continuous system modelling add-on, also offers sophisticated facilities for 
solving differential equations numerically. Again the package offers the choice of 
both fixed and variable time step methods, methods suitable for stiff problems as well 
as non-stiff ones, and a choice of optimizations aimed at either best speed or highest 
accuracy. Amongst the best known, and probably the most widely used, library of 
software procedures today is the NAG library. This library has a long history and has
been compiled by leading experts in the field of numerical mathematics. Routines are
available in a variety of programming languages. The routines provided for the solution 
of ordinary differential equations again encompass a variety of methods chosen to deal 
with stiff and non-stiff problems and to offer the user considerable flexibility in choice 
of method to suit every possible engineering requirement. By choosing an appropriate, 
high-quality software package or library the engineer can be assured that the imple- 
mentation will be, as far as possible, bug free, that the methods used will be efficient 
and reliable, and that the algorithms will have been chosen from the best ‘state of the 
art’ methods. 

It is tempting to believe that the use of software libraries solves all the problems of 
numerical analysis that an engineering user is likely to meet. Faced with a problem for 
which analytical methods fail, the engineer simply needs to thumb through the index to 
some numerical analysis software library until a method for solving the type of problem 
currently faced is found. Unfortunately such undiscerning use of packaged software 
will almost certainly, sooner or later, lead to a gross failure of some sort. If the user is 
fortunate, the software will be sophisticated enough to detect that the problem posed is 
outside its capabilities and to return an error message to that effect. If the user is less 
fortunate, the writer of the software will not have been able to foresee all the possible 
uses and misuses to which the software might be subjected and the software will not be 
proof against such use outside its range of applicability. In that case the software may 
produce seemingly valid answers while giving no indication of any potential problem. 
Under such circumstances the undiscerning user of engineering software is on the verge 
of committing a major engineering blunder. From such circumstances result failed 
bridges and crashed aircraft! It has been the objective of these sections on the numerical 
solution of differential equations both to equip readers with numerical methods suitable 
for the less demanding problems that will arise in their engineering careers and to give 
them sufficient understanding of the basics of this branch of numerical analysis that 
they may become discriminating, intelligent and wary users of packaged software and 
other aids to numerical computing. 


2.4 Numerical solution of second- and higher-order differential equations


Obviously, the classes of second- and higher-order differential equations that can be 
solved analytically, while representing an important subset of the totality of such 
equations, are relatively restricted. Just as for first-order equations, those for which no 
analytical solution exists can still be solved by numerical means. The numerical solu- 
tion of second- and higher-order equations does not, in fact, need any significant new 
mathematical theory or technique. 


2.4.1 Numerical solution of coupled first-order equations


In Section 2.3 we met various methods for the numerical solution of equations of the
form

$$\frac{dx}{dt} = f(t, x)$$

that is, first-order differential equations involving a single dependent variable and a
single independent variable. However it is possible to have sets of coupled first-order
equations, each involving the same independent variable but with more than one
dependent variable. An example of these types of equation is


— M (2.252) 
dt 
2 Веі (2.25) 


This is a pair of differential equations in the dependent variables x and y with the inde- 
pendent variable t. The derivative of each of the dependent variables depends not only 
on itself and on the independent variable t, but also on the other dependent variable. 
Neither of the equations can be solved in isolation or independently of the other — both 
must be solved simultaneously, or side by side. A pair of coupled differential equations 
such as (2.25) may be characterized as 


$$\frac{dx}{dt} = f_1(t, x, y) \tag{2.26a}$$

$$\frac{dy}{dt} = f_2(t, x, y) \tag{2.26b}$$

For a set of p such equations it is convenient to denote the dependent variables not by
x, y, z, ... but by $x_1, x_2, x_3, \ldots, x_p$ and the set of equations by

$$\frac{dx_i}{dt} = f_i(t, x_1, x_2, \ldots, x_p) \quad (i = 1, 2, \ldots, p)$$

or equivalently, using vector notation,

$$\frac{d}{dt}[\mathbf{x}] = \mathbf{f}(t, \mathbf{x})$$

where $\mathbf{x}(t)$ is a vector function of t given by

$$\mathbf{x}(t) = [x_1(t) \quad x_2(t) \quad \ldots \quad x_p(t)]$$

and $\mathbf{f}(t, \mathbf{x})$ is a vector-valued function of the scalar variable t and the vector variable $\mathbf{x}$.
The Euler method for the solution of a single differential equation takes the
form

$$X_{n+1} = X_n + hf(t_n, X_n)$$

If we were to try to apply this method to (2.26a), we should obtain

$$X_{n+1} = X_n + hf_1(t_n, X_n, Y_n)$$

In other words, the value of $X_{n+1}$ depends not only on $t_n$ and $X_n$ but also on $Y_n$. In the same
way, we would obtain

$$Y_{n+1} = Y_n + hf_2(t_n, X_n, Y_n)$$

for $Y_{n+1}$. In practice, this means that to solve two simultaneous differential equations,
we must advance the solution of both equations simultaneously in the manner shown in
Example 2.12.


Example 2.12 Find the value of X(1.4) satisfying the following initial-value problem:

$$\frac{dx}{dt} = x - y^2 + xt, \quad x(1) = 0.5$$

$$\frac{dy}{dt} = 2x^2 + xy - t, \quad y(1) = 1.2$$

using the Euler method with time step h = 0.1.


Solution The right-hand sides of the two equations will be denoted by $f_1(t, x, y)$ and $f_2(t, x, y)$
respectively, so

$$f_1(t, x, y) = x - y^2 + xt \quad \text{and} \quad f_2(t, x, y) = 2x^2 + xy - t$$

The initial condition is imposed at t = 1, so $t_n$ will denote 1 + nh, $X_n$ will denote X(1 + nh),
and $Y_n$ will denote Y(1 + nh). Then we have

$$X_1 = x_0 + hf_1(t_0, x_0, y_0) = 0.5 + 0.1f_1(1, 0.5, 1.2) = 0.4560$$

$$Y_1 = y_0 + hf_2(t_0, x_0, y_0) = 1.2 + 0.1f_2(1, 0.5, 1.2) = 1.2100$$

for the first step. The next step is therefore

$$X_2 = X_1 + hf_1(t_1, X_1, Y_1) = 0.4560 + 0.1f_1(1.1, 0.4560, 1.2100) = 0.4054$$

$$Y_2 = Y_1 + hf_2(t_1, X_1, Y_1) = 1.2100 + 0.1f_2(1.1, 0.4560, 1.2100) = 1.1968$$

and the third step is

$$X_3 = 0.4054 + 0.1f_1(1.2, 0.4054, 1.1968) = 0.3513$$

$$Y_3 = 1.1968 + 0.1f_2(1.2, 0.4054, 1.1968) = 1.1581$$

Finally, we obtain

$$X_4 = 0.3513 + 0.1f_1(1.3, 0.3513, 1.1581) = 0.2980$$

Hence we have X(1.4) = 0.2980.

MAPLE’s dsolve procedure can find the numerical solution of sets of coupled
ordinary differential equations as readily as for a single differential equation. The
following worksheet finds the solution required in the example above.


> #set up the two differential equations
> deq1:=diff(x(t),t)=x(t)-y(t)^2+x(t)*t;
> deq2:=diff(y(t),t)=2*x(t)^2+x(t)*y(t)-t;
> deqsystem:=deq1,deq2;
> #set up the initial conditions
> inits:=x(1)=0.5,y(1)=1.2;
> #procedure "dsolve" used to solve a system of two coupled
differential equations
> sol:=dsolve({deqsystem, inits}, numeric,
method=classical[foreuler],output=listprocedure,
stepsize=0.1);
> #obtain numerical solution required
> xx:=op(2,sol[2]);xx(1.4);


The principle of solving the two equations side by side extends in exactly the same 
way to the solution of more than two simultaneous equations and to the solution of 
simultaneous differential equations by methods other than the Euler method. 
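
As an alternative to the MAPLE worksheet, the side-by-side Euler computation of Example 2.12 can be sketched in Python. This is a minimal illustration, not part of the original text; the function names are hypothetical. The point to note is that both slopes are evaluated from the old values of X and Y before either variable is updated.

```python
def f1(t, x, y):
    """Right-hand side of dx/dt in Example 2.12."""
    return x - y**2 + x * t


def f2(t, x, y):
    """Right-hand side of dy/dt in Example 2.12."""
    return 2 * x**2 + x * y - t


def euler_pair(f1, f2, t0, x0, y0, h, n_steps):
    """Euler method for a pair of coupled first-order equations.
    Returns the list of (t, X, Y) values, including the initial point."""
    t, x, y = t0, x0, y0
    history = [(t, x, y)]
    for k in range(n_steps):
        # Evaluate both slopes at the current point before updating anything.
        dx = f1(t, x, y)
        dy = f2(t, x, y)
        x, y = x + h * dx, y + h * dy
        t = t0 + (k + 1) * h
        history.append((t, x, y))
    return history


for t, x, y in euler_pair(f1, f2, 1.0, 0.5, 1.2, 0.1, 4):
    print(f"t = {t:.1f}  X = {x:.4f}  Y = {y:.4f}")
```

The printed values of X and Y can be compared step by step with the hand computation in Example 2.12.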


Example 2.13 Find the value of X(1.4) satisfying the following initial-value problem:

$$\frac{dx}{dt} = x - y^2 + xt, \quad x(1) = 0.5$$

$$\frac{dy}{dt} = 2x^2 + xy - t, \quad y(1) = 1.2$$

using the second-order predictor-corrector method with time step h = 0.1.


Solution First step:

predictor

$$\hat{X}_1 = x_0 + hf_1(t_0, x_0, y_0) = 0.4560$$

$$\hat{Y}_1 = y_0 + hf_2(t_0, x_0, y_0) = 1.2100$$

corrector

$$X_1 = x_0 + \tfrac{1}{2}h[f_1(t_0, x_0, y_0) + f_1(t_1, \hat{X}_1, \hat{Y}_1)] = 0.5 + 0.05[f_1(1, 0.5, 1.2) + f_1(1.1, 0.456, 1.21)] = 0.4527$$

$$Y_1 = y_0 + \tfrac{1}{2}h[f_2(t_0, x_0, y_0) + f_2(t_1, \hat{X}_1, \hat{Y}_1)] = 1.2 + 0.05[f_2(1, 0.5, 1.2) + f_2(1.1, 0.456, 1.21)] = 1.1984$$

Second step:

predictor

$$\hat{X}_2 = X_1 + hf_1(t_1, X_1, Y_1) = 0.4042$$

$$\hat{Y}_2 = Y_1 + hf_2(t_1, X_1, Y_1) = 1.1836$$

corrector

$$X_2 = X_1 + \tfrac{1}{2}h[f_1(t_1, X_1, Y_1) + f_1(t_2, \hat{X}_2, \hat{Y}_2)] = 0.4527 + 0.05[f_1(1.1, 0.4527, 1.1984) + f_1(1.2, 0.4042, 1.1836)] = 0.4028$$

$$Y_2 = Y_1 + \tfrac{1}{2}h[f_2(t_1, X_1, Y_1) + f_2(t_2, \hat{X}_2, \hat{Y}_2)] = 1.1984 + 0.05[f_2(1.1, 0.4527, 1.1984) + f_2(1.2, 0.4042, 1.1836)] = 1.1713$$

Third step:

predictor

$$\hat{X}_3 = X_2 + hf_1(t_2, X_2, Y_2) = 0.3542$$

$$\hat{Y}_3 = Y_2 + hf_2(t_2, X_2, Y_2) = 1.1309$$

corrector

$$X_3 = X_2 + \tfrac{1}{2}h[f_1(t_2, X_2, Y_2) + f_1(t_3, \hat{X}_3, \hat{Y}_3)] = 0.4028 + 0.05[f_1(1.2, 0.4028, 1.1713) + f_1(1.3, 0.3542, 1.1309)] = 0.3553$$

$$Y_3 = Y_2 + \tfrac{1}{2}h[f_2(t_2, X_2, Y_2) + f_2(t_3, \hat{X}_3, \hat{Y}_3)] = 1.1713 + 0.05[f_2(1.2, 0.4028, 1.1713) + f_2(1.3, 0.3542, 1.1309)] = 1.1186$$

Fourth step:

predictor

$$\hat{X}_4 = X_3 + hf_1(t_3, X_3, Y_3) = 0.3119$$

$$\hat{Y}_4 = Y_3 + hf_2(t_3, X_3, Y_3) = 1.0536$$

corrector

$$X_4 = X_3 + \tfrac{1}{2}h[f_1(t_3, X_3, Y_3) + f_1(t_4, \hat{X}_4, \hat{Y}_4)] = 0.3553 + 0.05[f_1(1.3, 0.3553, 1.1186) + f_1(1.4, 0.3119, 1.0536)]$$

Hence finally we have X(1.4) = 0.3155.

The MAPLE worksheet at the end of Example 2.12 can be easily modified to
reproduce the solution of Example 2.13 by changing the name of the required numerical
method from foreuler to heunform.


It should be obvious from Example 2.13 that the main drawback of extending 
the methods we already have at our disposal to sets of differential equations is the 
additional labour and tedium of the computations. Intrinsically, the computations are 
no more difficult, merely much more laborious — a prime example of a problem ripe 
for computerization. 
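
To show how little the method changes when it is applied to a set of equations, the following Python sketch implements the second-order predictor-corrector method for a vector of dependent variables and applies it to the problem of Example 2.13. It is an illustration only; the names are hypothetical and the dependent variables are simply held in a list.

```python
def f(t, u):
    """Right-hand sides of Example 2.13, with u = [x, y]."""
    x, y = u
    return [x - y**2 + x * t, 2 * x**2 + x * y - t]


def predictor_corrector_system(f, t0, u0, h, n_steps):
    """Second-order predictor-corrector for a set of first-order equations.
    f(t, u) must return a list of slopes, one per dependent variable."""
    t, u = t0, list(u0)
    for k in range(n_steps):
        t_next = t0 + (k + 1) * h
        slope_old = f(t, u)
        # Predictor: an Euler step applied to every variable at once.
        u_pred = [ui + h * si for ui, si in zip(u, slope_old)]
        # Corrector: average the old slopes and the slopes at the predicted point.
        slope_new = f(t_next, u_pred)
        u = [ui + 0.5 * h * (so + sn)
             for ui, so, sn in zip(u, slope_old, slope_new)]
        t = t_next
    return u


x_final, y_final = predictor_corrector_system(f, 1.0, [0.5, 1.2], 0.1, 4)
print(f"X(1.4) = {x_final:.4f}")
```

Because the program carries full precision from step to step, its intermediate values may differ from the hand computation in the fourth decimal place, since the hand computation rounds at every step.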


2.4.2 State-space representation of higher-order systems 


The solution of differential equation initial-value problems of order greater than one can 
be reduced to the solution of a set of first-order differential equations using the state-space 
representation introduced in Section 1.9. This is achieved by a simple transformation, 
illustrated by Example 2.14. 


Example 2.14 The initial-value problem

$$\frac{d^2x}{dt^2} + x^2t\frac{dx}{dt} - xt^2 = \tfrac{1}{2}t^2, \quad x(0) = 1.2, \quad \frac{dx}{dt}(0) = 0.8$$

can be transformed into two coupled first-order differential equations by introducing
an additional variable

$$y = \frac{dx}{dt}$$

With this definition, we have

$$\frac{d^2x}{dt^2} = \frac{dy}{dt}$$

and so the differential equation becomes

$$\frac{dy}{dt} + x^2ty - xt^2 = \tfrac{1}{2}t^2$$

Thus the original differential equation can be replaced by a pair of coupled first-order
differential equations, together with initial conditions:

$$\frac{dx}{dt} = y, \quad x(0) = 1.2$$

$$\frac{dy}{dt} = -x^2ty + xt^2 + \tfrac{1}{2}t^2, \quad y(0) = 0.8$$

This process can be extended to transform a pth-order initial-value problem into a 
set of p first-order equations, each with an initial condition. Once the original equation 
has been transformed in this way, its solution by numerical methods is just the same 
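as if it had been a set of coupled equations in the first place.

Example 2.15 Find the value of X(0.2) satisfying the initial-value problem

$$\frac{d^3x}{dt^3} + xt\frac{d^2x}{dt^2} + t\frac{dx}{dt} - t^2x = 0, \quad x(0) = 1, \quad \frac{dx}{dt}(0) = 0.5, \quad \frac{d^2x}{dt^2}(0) = -0.2$$

using the fourth-order Runge-Kutta scheme with step size h = 0.05.


Solution Since this is a third-order equation, we need to introduce two new variables:

$$y = \frac{dx}{dt} \quad \text{and} \quad z = \frac{dy}{dt} = \frac{d^2x}{dt^2}$$

Then the equation is transformed into a set of three first-order differential equations

$$\frac{dx}{dt} = y, \quad x(0) = 1$$

$$\frac{dy}{dt} = z, \quad y(0) = 0.5$$

$$\frac{dz}{dt} = -xtz - ty + t^2x, \quad z(0) = -0.2$$

Applied to the set of differential equations

$$\frac{dx}{dt} = f_1(t, x, y, z), \quad \frac{dy}{dt} = f_2(t, x, y, z), \quad \frac{dz}{dt} = f_3(t, x, y, z)$$

the Runge-Kutta scheme is of the form

$$c_{11} = hf_1(t_n, X_n, Y_n, Z_n)$$
$$c_{21} = hf_2(t_n, X_n, Y_n, Z_n)$$
$$c_{31} = hf_3(t_n, X_n, Y_n, Z_n)$$
$$c_{12} = hf_1(t_n + \tfrac{1}{2}h, X_n + \tfrac{1}{2}c_{11}, Y_n + \tfrac{1}{2}c_{21}, Z_n + \tfrac{1}{2}c_{31})$$
$$c_{22} = hf_2(t_n + \tfrac{1}{2}h, X_n + \tfrac{1}{2}c_{11}, Y_n + \tfrac{1}{2}c_{21}, Z_n + \tfrac{1}{2}c_{31})$$
$$c_{32} = hf_3(t_n + \tfrac{1}{2}h, X_n + \tfrac{1}{2}c_{11}, Y_n + \tfrac{1}{2}c_{21}, Z_n + \tfrac{1}{2}c_{31})$$
$$c_{13} = hf_1(t_n + \tfrac{1}{2}h, X_n + \tfrac{1}{2}c_{12}, Y_n + \tfrac{1}{2}c_{22}, Z_n + \tfrac{1}{2}c_{32})$$
$$c_{23} = hf_2(t_n + \tfrac{1}{2}h, X_n + \tfrac{1}{2}c_{12}, Y_n + \tfrac{1}{2}c_{22}, Z_n + \tfrac{1}{2}c_{32})$$
$$c_{33} = hf_3(t_n + \tfrac{1}{2}h, X_n + \tfrac{1}{2}c_{12}, Y_n + \tfrac{1}{2}c_{22}, Z_n + \tfrac{1}{2}c_{32})$$
$$c_{14} = hf_1(t_n + h, X_n + c_{13}, Y_n + c_{23}, Z_n + c_{33})$$
$$c_{24} = hf_2(t_n + h, X_n + c_{13}, Y_n + c_{23}, Z_n + c_{33})$$
$$c_{34} = hf_3(t_n + h, X_n + c_{13}, Y_n + c_{23}, Z_n + c_{33})$$
$$X_{n+1} = X_n + \tfrac{1}{6}(c_{11} + 2c_{12} + 2c_{13} + c_{14})$$
$$Y_{n+1} = Y_n + \tfrac{1}{6}(c_{21} + 2c_{22} + 2c_{23} + c_{24})$$
$$Z_{n+1} = Z_n + \tfrac{1}{6}(c_{31} + 2c_{32} + 2c_{33} + c_{34})$$

Note that each of the four substeps of the Runge-Kutta scheme must be carried out in
parallel on each of the equations, since the intermediate values for all the dependent
variables are needed in the next substep for each variable; for instance, the computation
of $c_{13}$ requires not only the value of $c_{12}$ but also the values of $c_{22}$ and $c_{32}$. The first step of
the computation in this case proceeds thus:

$$X_0 = x_0 = 1, \quad Y_0 = y_0 = 0.5, \quad Z_0 = z_0 = -0.2$$

$$c_{11} = hf_1(t_0, X_0, Y_0, Z_0) = hY_0 = 0.025\,000$$

$$c_{21} = hf_2(t_0, X_0, Y_0, Z_0) = hZ_0 = -0.010\,000$$

$$c_{31} = hf_3(t_0, X_0, Y_0, Z_0) = h(-X_0t_0Z_0 - t_0Y_0 + t_0^2X_0) = 0.000\,000$$

$$c_{12} = hf_1(t_0 + \tfrac{1}{2}h, X_0 + \tfrac{1}{2}c_{11}, Y_0 + \tfrac{1}{2}c_{21}, Z_0 + \tfrac{1}{2}c_{31}) = h(Y_0 + \tfrac{1}{2}c_{21}) = 0.024\,750$$

$$c_{22} = hf_2(t_0 + \tfrac{1}{2}h, X_0 + \tfrac{1}{2}c_{11}, Y_0 + \tfrac{1}{2}c_{21}, Z_0 + \tfrac{1}{2}c_{31}) = h(Z_0 + \tfrac{1}{2}c_{31}) = -0.010\,000$$

$$c_{32} = hf_3(t_0 + \tfrac{1}{2}h, X_0 + \tfrac{1}{2}c_{11}, Y_0 + \tfrac{1}{2}c_{21}, Z_0 + \tfrac{1}{2}c_{31})$$
$$= h(-(X_0 + \tfrac{1}{2}c_{11})(t_0 + \tfrac{1}{2}h)(Z_0 + \tfrac{1}{2}c_{31}) - (t_0 + \tfrac{1}{2}h)(Y_0 + \tfrac{1}{2}c_{21}) + (t_0 + \tfrac{1}{2}h)^2(X_0 + \tfrac{1}{2}c_{11})) = -0.000\,334$$

$$c_{13} = hf_1(t_0 + \tfrac{1}{2}h, X_0 + \tfrac{1}{2}c_{12}, Y_0 + \tfrac{1}{2}c_{22}, Z_0 + \tfrac{1}{2}c_{32}) = h(Y_0 + \tfrac{1}{2}c_{22}) = 0.024\,750$$

$$c_{23} = hf_2(t_0 + \tfrac{1}{2}h, X_0 + \tfrac{1}{2}c_{12}, Y_0 + \tfrac{1}{2}c_{22}, Z_0 + \tfrac{1}{2}c_{32}) = h(Z_0 + \tfrac{1}{2}c_{32}) = -0.010\,008$$

$$c_{33} = hf_3(t_0 + \tfrac{1}{2}h, X_0 + \tfrac{1}{2}c_{12}, Y_0 + \tfrac{1}{2}c_{22}, Z_0 + \tfrac{1}{2}c_{32})$$
$$= h(-(X_0 + \tfrac{1}{2}c_{12})(t_0 + \tfrac{1}{2}h)(Z_0 + \tfrac{1}{2}c_{32}) - (t_0 + \tfrac{1}{2}h)(Y_0 + \tfrac{1}{2}c_{22}) + (t_0 + \tfrac{1}{2}h)^2(X_0 + \tfrac{1}{2}c_{12})) = -0.000\,334$$

$$c_{14} = hf_1(t_0 + h, X_0 + c_{13}, Y_0 + c_{23}, Z_0 + c_{33}) = h(Y_0 + c_{23}) = 0.024\,499$$

$$c_{24} = hf_2(t_0 + h, X_0 + c_{13}, Y_0 + c_{23}, Z_0 + c_{33}) = h(Z_0 + c_{33}) = -0.010\,016$$

$$c_{34} = hf_3(t_0 + h, X_0 + c_{13}, Y_0 + c_{23}, Z_0 + c_{33})$$
$$= h(-(X_0 + c_{13})(t_0 + h)(Z_0 + c_{33}) - (t_0 + h)(Y_0 + c_{23}) + (t_0 + h)^2(X_0 + c_{13})) = -0.000\,584$$

$$X_1 = 1.024\,750, \quad Y_1 = 0.489\,994, \quad Z_1 = -0.200\,320$$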


The second and subsequent steps are similar — we shall not present the details of the 
computations. It should be obvious by now that computations like these are sufficiently 
tedious to justify the effort of writing a computer program to carry out the actual arith- 
metic. The essential point for the reader to grasp is not the mechanics, but rather the 
principle whereby methods for the solution of first-order differential equations can be 
extended to the solution of sets of equations and hence to higher-order equations. 
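
A program of the kind just mentioned need not be long. The following Python sketch (an illustration with hypothetical names, not the MAPLE procedure used in this text) carries out the fourth-order Runge-Kutta scheme for the state-space form of Example 2.15, holding the three dependent variables x, y and z in a list so that each substep is applied to all of them together.

```python
def f(t, u):
    """State-space form of Example 2.15, with u = [x, y, z]."""
    x, y, z = u
    return [y, z, -x * t * z - t * y + t**2 * x]


def rk4_step(f, t, u, h):
    """One fourth-order Runge-Kutta step for a set of first-order equations."""
    c1 = [h * s for s in f(t, u)]
    c2 = [h * s for s in f(t + 0.5 * h, [ui + 0.5 * ci for ui, ci in zip(u, c1)])]
    c3 = [h * s for s in f(t + 0.5 * h, [ui + 0.5 * ci for ui, ci in zip(u, c2)])]
    c4 = [h * s for s in f(t + h, [ui + ci for ui, ci in zip(u, c3)])]
    return [ui + (a + 2 * b + 2 * c + d) / 6
            for ui, a, b, c, d in zip(u, c1, c2, c3, c4)]


def rk4(f, t0, u0, h, n_steps):
    """Apply n_steps Runge-Kutta steps starting from (t0, u0)."""
    u = list(u0)
    for k in range(n_steps):
        u = rk4_step(f, t0 + k * h, u, h)
    return u


first = rk4_step(f, 0.0, [1.0, 0.5, -0.2], 0.05)
print("after one step:", [round(v, 6) for v in first])
print("X(0.2) =", rk4(f, 0.0, [1.0, 0.5, -0.2], 0.05, 4)[0])
```

The values after one step can be checked against $X_1$, $Y_1$ and $Z_1$ in the hand computation above; the remaining three steps needed to reach t = 0.2 are then carried out by the same function.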


Again MAPLE could be used to find the numerical solution of this set of coupled
ordinary differential equations. However, the MAPLE dsolve procedure is also able
to do the conversion of the higher-order equation into a set of first-order equations
internally so the numerical solution of the example above using the fourth-order
Runge-Kutta algorithm could be achieved with the following worksheet.


> #set up the differential equation
> deq:=diff(x(t),t$3)+x(t)*t*diff(x(t),t$2)
+t*diff(x(t),t)-t^2*x(t)=0;
> #set up the initial conditions
> inits:=x(0)=1,D(x)(0)=0.5,D(D(x))(0)=-0.2;
> #procedure "dsolve" used to solve third order
differential equations
> sol:=dsolve({deq, inits}, numeric,method=classical[rk4],
output=listprocedure, stepsize=0.05);
> #obtain the numerical solution required
> xx:=op(2,sol[2]);xx(0.2);


2.4.3 Exercises


18 Transform the following initial-value problems into sets of first-order differential
equations with appropriate initial conditions:


2 
() C 6o? - 9 -axi- o 
dt dt 
x0-1, 8-2 20 
dx 


(b — 440? - 70)? -0 
dt 
=2 у 
x(1) = 2, 0) 0.5 


2 
() £x- sin( 2) +4х=0 
df dt 


-9. Z= 
х(0) =0, 20) =0 


3 2 
dx, PX, Git г 
df df dt 


2 
—xt-e 


(d) 
2 

x0)-1, 99-2, Š 0)=0 
dt dt 


3 2 
(e) 4х P84? = sing 
dt dt 


2 
x=1, ay=0, Ф®пу=-—2 
dt dt 





2l 
3 1/2 2 
o (E) ено 
dt dt 
_ 2 
х(2) = 0, dx (2) — 0, $02 
dt dt 
4 2 
(8) 4 rf8 47-1, xH=0, “@=0, 
df df dt 
2 3 
dž o=4, S0) =-3 
dt df 
4 3 
(h) dx ES (2 = г); + mr. on” 
dt dt df dt 
22 
=ť +4t-5 
dx dx dx = 
х(0)=а, = (0)=0 = O=b, = 0)=0 
dt dt dt 


19 Find the value of X(0.3) for the initial-value problem


5 : 
CX. LL gp; x(0) = 0, d (92 1 
dt 


dt dt 


using the Euler method with step size h = 0.1. 


20 The second-order Adams-Bashforth method for the integration of a single first-order
differential equation

$$\frac{dx}{dt} = f(t, x)$$

is

$$X_{n+1} = X_n + \tfrac{1}{2}h[3f(t_n, X_n) - f(t_{n-1}, X_{n-1})]$$

Write down the appropriate equations for applying the same method to the solution of
the pair of differential equations

$$\frac{dx}{dt} = f_1(t, x, y), \quad \frac{dy}{dt} = f_2(t, x, y)$$


Hence find the value of X(0.3) for the initial-value 
problem 
2 

dx, POX 4 y= sin, x(0)=0, 2 ()=1 

dt dt dt 
using this Adams-Bashforth method with step size h = 0.1. Use the second-order
predictor-corrector method for the first step to start the computation.


21 Use the second-order predictor-corrector method (that is, the first-order
Adams-Bashforth formula as predictor and the second-order Adams-Moulton formula as
corrector) to compute an approximation X(0.65) to the solution x(0.65) of the
initial-value problem


3 2 2 
eo - of +[®) -x-0 
df dt dt 

z 
x0.5)=-1, €(o5-1, (0.5) =2 
dt dt 


using a step size h = 0.05. 


22 Write a computer program to solve the initial-value problem
2 
dx, PA. y= sine, «09-0, € (9-1 
ағ dt dt 


using the fourth-order Runge-Kutta method. Use your program to find the value of X(1.6)
using step sizes h = 0.4 and 0.2. Estimate the accuracy of your value of X(1.6) and
estimate the step size that would be necessary to obtain a value of X(1.6) accurate to 6dp.


23 Write a computer program to solve the initial-value problem


3 2 2 
ita-nia (2) -x =0 
df df dt 


dx dix 
x(0.5) =-1, 051, 42005)=2 


using the third-order predictor-corrector method (that is, the second-order
Adams-Bashforth formula as predictor with the third-order Adams-Moulton as corrector).
Use the fourth-order Runge-Kutta method to overcome the starting problem with this
process. Use your program to find the value of X(2.2) using step sizes h = 0.1 and 0.05.
Estimate the accuracy of your value of X(2.2) and estimate the step size that would be
necessary to obtain a value of X(2.2) accurate to 6dp.


Note: The comment on the use of high-level computer language and the MATLAB and MAPLE
packages at the end of Section 2.3.9 is equally applicable to the immediately preceding
exercises in this section.


2.4.4 Boundary-value problems


Because first-order ordinary differential equations only have one boundary condition, 
that condition can always be treated as an initial condition. Once we turn to second- and 
higher-order differential equations, there are, at least for fully determined problems, two 
or more boundary conditions. If the boundary conditions are all imposed at the same point 
then the problem is an initial-value problem and can be solved by the methods we have 
already described. The problems that have been used as illustrations in Sections 2.4.1 
and 2.4.2 were all initial-value problems. Boundary-value problems are somewhat more 
difficult to solve than initial-value problems. 

To illustrate the difficulties of boundary-value problems, let us consider second-order
differential equations. These have two boundary conditions. If they are both imposed at the
same point (and so are initial conditions), the conditions will usually be a value of the
dependent variable and of its derivative, for instance a problem like

$$L[x(t)] = f(t), \quad x(a) = p, \quad \frac{dx}{dt}(a) = q$$

where L is some differential operator. Occasionally, a mixed boundary condition such as

$$Cx(a) + D\frac{dx}{dt}(a) = p$$

will arise. Provided that a second boundary condition on x or dx/dt is imposed at the same
point, this causes no difficulty, since the boundary conditions can be decoupled, that is
solved to give values of x(a) and (dx/dt)(a), before the problem is solved.

If the two boundary conditions are imposed at different points then they could consist
of two values of the dependent variable, the value of the dependent variable at
one boundary and its derivative at the other, or even linear combinations of the values
of the dependent variable and its derivative. For instance, we may have

\[ L[x(t)] = f(t), \quad x(a) = p, \quad x(b) = q \]

or

\[ L[x(t)] = f(t), \quad \frac{dx}{dt}(a) = p, \quad x(b) = q \]

or

\[ L[x(t)] = f(t), \quad x(a) = p, \quad \frac{dx}{dt}(b) = q \]

or even such systems as

\[ L[x(t)] = f(t), \quad x(a) = p, \quad A\,x(b) + B\,\frac{dx}{dt}(b) = q \]

The increased range of possibilities introduced by boundary-value problems almost
inevitably increases the problems which may arise in their solution. For instance, it may at
first sight seem that it should also be possible to solve problems with boundary conditions
consisting of the derivative at both boundaries, such as

\[ L[x(t)] = f(t), \quad \frac{dx}{dt}(a) = p, \quad \frac{dx}{dt}(b) = q \]

Things are unfortunately not that simple, as Example 2.16 shows.


Example 2.16

Solve the boundary-value problem

\[ \frac{d^2x}{dt^2} = 4, \quad \frac{dx}{dt}(0) = p, \quad \frac{dx}{dt}(1) = q \]

Solution

Integrating twice easily yields the general solution

\[ x = 2t^2 + At + B \]

The boundary conditions then impose

\[ A = p \quad \text{and} \quad 4 + A = q \]

It is obviously not possible to find a value of A satisfying both these equations unless
q = p + 4. In any event, whether or not p and q satisfy this relation, it is not possible to
determine the constant B.


Example 2.16 illustrates the fact that if derivative boundary conditions are to be 
applied, a supplementary compatibility condition is needed. In addition, there may be a 
residual uncertainty in the solution. The complete analysis of what types of boundary 
conditions are allowable for two-point boundary-value problems is beyond the scope of 
this book. Differential equations of orders higher than two increase the range of possi- 
bilities even further and introduce further complexities into the determination of what 
boundary conditions are allowable and valid. 
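The compatibility condition can be checked mechanically. The sketch below applies two
derivative conditions, dx/dt = p at t = 0 and dx/dt = q at t = 1, to the general solution
x = 2t^2 + At + B of d^2x/dt^2 = 4. The function name and the sample values of p and q are
illustrative assumptions.

```python
def neumann_example(p, q, tol=1e-12):
    """Apply dx/dt(0) = p and dx/dt(1) = q to x = 2 t^2 + A t + B.

    Returns (compatible, A). The constant B appears in neither condition,
    so it stays undetermined whatever p and q are.
    """
    # dx/dt = 4 t + A, so the two conditions read A = p and 4 + A = q.
    A = p
    compatible = abs((4.0 + A) - q) <= tol
    return compatible, A

print(neumann_example(1.0, 5.0))   # q = p + 4: solutions exist for every B
print(neumann_example(1.0, 2.0))   # q != p + 4: no solution
```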


2.4.5 The method of shooting


One obvious way of solving two-point boundary-value problems is a form of systematic
trial and error in which the boundary-value problem is replaced by an initial-value
problem with initial values given at one of the two boundary points. The initial-value
problem can be solved by an appropriate numerical technique and the value of whatever
function is involved in the boundary condition at the second boundary point determined.
The initial values are then adjusted and another initial-value problem solved.
This process is repeated until a solution is found with the appropriate value at the
second boundary point.

Figure 2.22 The solution of a differential equation by the method of shooting: initial trials.

As an illustration, we shall consider a second-order boundary-value problem of
the form

\[ L[x] = f(t), \quad x(a) = p, \quad x(b) = q \tag{2.27} \]

The related initial-value problem

\[ L[x] = f(t), \quad x(a) = p, \quad \frac{dx}{dt}(a) = 0 \tag{2.28} \]

could be solved as described in Section 2.4.2. Suppose that doing this results in an
approximate solution of (2.28) denoted by $X_1$. In the same way, denote the solution of
the problem

\[ L[x] = f(t), \quad x(a) = p, \quad \frac{dx}{dt}(a) = 1 \tag{2.29} \]

by $X_2$. We now have a situation as shown in Figure 2.22. The values of the two solutions
at the point t = b are $X_1(b)$ and $X_2(b)$. The original boundary-value problem (2.27)
requires a value q at b. Since q is roughly three-quarters of the way between $X_1(b)$ and
$X_2(b)$, we should intuitively expect that solving the initial-value problem

\[ L[x] = f(t), \quad x(a) = p, \quad \frac{dx}{dt}(a) = 0.75 \tag{2.30} \]

will produce a solution with X(b) much closer to q. What we have done, of course,
is to assume that X(b) varies continuously and roughly in proportion to (dx/dt)(a)
and then to use linear interpolation to estimate a better value of (dx/dt)(a). It is unlikely,
of course, that X(b) will vary exactly linearly with (dx/dt)(a) so the solution of (2.30),
call it $X_3$, will be something like that shown in Figure 2.23. The process of linear
interpolation to estimate a value of (dx/dt)(a) and the subsequent solution of the
resulting initial-value problem can be repeated until a solution is found with a value
of X(b) as close to q as may be required. This method of solution is known, by an
obvious analogy with the bracketing method employed by artillerymen to find their
targets, as the method of shooting. Shooting is not restricted to solving two-point
boundary-value problems in which the two boundary values are values of the dependent
variable. Problems involving boundary values on the derivatives can be solved in an
analogous manner.

Figure 2.23 The solution of a differential equation by the method of shooting: first refinement.

The solution of a two-point boundary-value problem by the method of shooting 
involves repeatedly solving a similar initial-value problem. It is therefore obvious that 
the amount of computation required to obtain a solution to a two-point boundary- 
value problem by this method is certain to be an order of magnitude or more greater 
than that required to solve an initial-value problem of the same order to the same 
accuracy. The method for finding the solution that satisfies the boundary condition at 
the second boundary point which we have just described used linear interpolation. It is 
possible to reduce the computation required by using more sophisticated interpolation 
methods. For instance, a version of the method of shooting that utilizes Newton-Raphson 
iteration is described in R. D. Milne, Applied Functional Analysis, An Introductory 
Treatment (Pitman, London, 1979). 
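The shooting procedure with linear interpolation can be sketched in a few lines. The
example below is a minimal illustration, not a definitive implementation: it assumes a
classical fourth-order Runge-Kutta integrator for the trial initial-value problems and, as
a test problem chosen here, the linear equation x'' + e^t x' + x = 0 with x(0) = 0 and
x(2) = 1. All function names, the step count and the two starting slopes (0 and 1) are
assumptions.

```python
import math

def rk4_system(f, t, y, h):
    """One classical RK4 step for y' = f(t, y), with y a list of values."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def end_value(f, a, b, p, slope, n=400):
    """Integrate from t = a with x(a) = p, x'(a) = slope and return x(b)."""
    h = (b - a) / n
    y = [p, slope]
    for i in range(n):
        y = rk4_system(f, a + i * h, y, h)
    return y[0]

def shoot(f, a, b, p, q, s0=0.0, s1=1.0, tol=1e-10, max_iter=50):
    """Adjust the initial slope by repeated linear interpolation until x(b) = q."""
    x0 = end_value(f, a, b, p, s0)
    x1 = end_value(f, a, b, p, s1)
    for _ in range(max_iter):
        if abs(x1 - q) < tol:
            break
        # Linear interpolation (or extrapolation) between the last two trials.
        s0, s1 = s1, s0 + (q - x0) * (s1 - s0) / (x1 - x0)
        x0, x1 = x1, end_value(f, a, b, p, s1)
    return s1, x1

def rhs(t, y):
    """x'' + e^t x' + x = 0 written as a first-order system in (x, v)."""
    x, v = y
    return [v, -math.exp(t) * v - x]

slope, x_b = shoot(rhs, 0.0, 2.0, 0.0, 1.0)
print(slope, x_b)
```

Because this test equation is linear, X(b) depends linearly on the initial slope and a
single interpolation already lands on the target to within rounding error; for a nonlinear
equation the loop would normally run several times.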


2.4.6 Function approximation methods


The method of shooting is not the only way of solving boundary-value problems numerically.
Other methods include various finite-difference techniques and a set of methods
that can be collectively characterized as function approximation methods. In a
finite-difference method the differential operator of the differential equation is replaced
by a finite-difference approximation to the operator. This leads to a set of linear
algebraic equations relating the values of the solution to the differential equation at 
a set of discrete values of the independent variable. Function approximation methods 
include various collocation methods and the finite-element method. In this section we 
shall very briefly outline function approximation methods and give an elementary 
example of the use of a collocation method. It is not appropriate to give an extensive 
treatment of these methods in this book; the reader needing more detail should refer to 
more advanced texts. 
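To make the finite-difference idea concrete, the sketch below replaces the derivatives in
the test problem x'' + e^t x' + x = 0, x(0) = 0, x(2) = 1 (a problem chosen here for
illustration) by central differences on a uniform grid and solves the resulting
tridiagonal linear system. The function name, the grid sizes and the use of the Thomas
algorithm are assumptions of this sketch.

```python
import math

def finite_difference_bvp(n):
    """Central-difference solution of x'' + e^t x' + x = 0, x(0) = 0, x(2) = 1.

    Uses n equal subintervals; returns the grid and the nodal values.
    """
    a, b, xa, xb = 0.0, 2.0, 0.0, 1.0
    h = (b - a) / n
    t = [a + i * h for i in range(n + 1)]
    # Tridiagonal system for the interior unknowns x_1 .. x_{n-1}.
    lower, diag, upper, rhs = [], [], [], []
    for i in range(1, n):
        c = math.exp(t[i]) / (2 * h)
        lower.append(1 / h**2 - c)      # multiplies x_{i-1}
        diag.append(-2 / h**2 + 1)      # multiplies x_i
        upper.append(1 / h**2 + c)      # multiplies x_{i+1}
        rhs.append(0.0)
    # Move the known boundary values to the right-hand side.
    rhs[0] -= lower[0] * xa
    rhs[-1] -= upper[-1] * xb
    # Thomas algorithm: forward elimination, then back substitution.
    m = n - 1
    for i in range(1, m):
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        rhs[i] -= w * rhs[i - 1]
    x = [0.0] * m
    x[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        x[i] = (rhs[i] - upper[i] * x[i + 1]) / diag[i]
    return t, [xa] + x + [xb]

t_coarse, x_coarse = finite_difference_bvp(100)
t_fine, x_fine = finite_difference_bvp(200)
print(x_coarse[50], x_fine[100])   # both approximate x(1)
```

Comparing the two grids gives a rough indication of the discretization error, in the same
spirit as the step-halving checks used for initial-value problems.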

The method of shooting solves a boundary-value problem by starting at one boundary 
and constructing an approximate solution to the problem step by step until the second 
boundary is reached. In contrast with this, function approximation methods find an
approximate solution by assuming a particular type or form of function for the solution
over the whole range of the problem. This function (usually referred to as the trial 
function) is then substituted into the differential equation and its boundary conditions. 
Trial functions always contain some unknown parameters, and, once the function has 
been substituted into the differential equation, some criterion can be used to assign 
values to these initially unknown parameters in such a way as to make the trial function 
as close an approximation as possible to the solution of the boundary-value problem. 

Unless a very fortuitous choice of trial function is made, it is unlikely that it will
be possible to make the function chosen satisfy the differential equation exactly. If,
for instance, a trial function depending on some parameters $p_1, p_2, \ldots$ and denoted
by $X(t; p_1, p_2, \ldots)$ is to be used to obtain an approximate solution to the differential
equation L[x(t)] = 0 then substituting this function into the differential equation results
in a function

\[ L[X(t; p_1, p_2, \ldots)] = \eta(t; p_1, p_2, \ldots) \]


which is called the residual of the equation. Intuitively, it seems likely that making this 
residual as small as possible will result in a good approximation to the solution of the 
equation. But what does making a function as small as possible mean? The most com- 
mon approaches are to make the residual zero at some discrete set of points distributed 
over the range of the independent variable — this gives rise to collocation methods — or 
to minimize, in some way, some measure of the overall size of the residual (for instance, 
the integral of the square of the residual) — this is commonly used in finite-element 
methods. 
Thus, for instance, to solve the boundary-value problem

\[ L[x(t)] = 0, \quad x(a) = q, \quad x(b) = r \tag{2.31} \]

we should assume that the trial function X(t), an approximation to x(t), takes some form
such as

\[ X(t) = \sum_{i=1}^{n} p_i f_i(t) \tag{2.32} \]

where $\{p_i : i = 1, 2, \ldots, n\}$ is the set of parameters that are to be determined and
$\{f_i(t) : i = 1, 2, \ldots, n\}$ is some set of functions of t. Substituting the approximation
(2.32) into the original problem (2.31) gives

\[ L\Big[\sum_{i=1}^{n} p_i f_i(t)\Big] = \eta(t) \tag{2.33a} \]

\[ \sum_{i=1}^{n} p_i f_i(a) = q \tag{2.33b} \]

and

\[ \sum_{i=1}^{n} p_i f_i(b) = r \tag{2.33c} \]

Equations (2.33b, c) express the requirement that the approximation chosen will satisfy
the boundary conditions of the problem. The function $\eta(t)$ in (2.33a) is the residual of
the problem. Since (2.33b, c) impose two conditions on the choice of the parameters
$p_1, p_2, \ldots, p_n$ we need another n - 2 conditions to determine all the $p_i$. For a
collocation solution this is done by choosing n - 2 values of t such that
$a < t_1 < t_2 < \cdots < t_{n-2} < b$ and making $\eta(t_k) = 0$ for k = 1, 2, ..., n - 2.
Thus we have the n equations
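\[ L\Big[\sum_{i=1}^{n} p_i f_i(t_k)\Big] = 0 \quad (k = 1, 2, \ldots, n-2) \tag{2.34a} \]

\[ \sum_{i=1}^{n} p_i f_i(a) = q \tag{2.34b} \]

\[ \sum_{i=1}^{n} p_i f_i(b) = r \tag{2.34c} \]

for the n unknown parameters $p_1, p_2, \ldots, p_n$. In general, these equations will be
nonlinear in the $p_i$, but if the operator L is a linear operator then they may be rewritten as

\[ \sum_{i=1}^{n} p_i L[f_i(t_k)] = 0 \quad (k = 1, 2, \ldots, n-2) \tag{2.35a} \]

\[ \sum_{i=1}^{n} p_i f_i(a) = q \tag{2.35b} \]

\[ \sum_{i=1}^{n} p_i f_i(b) = r \tag{2.35c} \]

and are linear in the $p_i$. They therefore constitute a matrix equation for the $p_i$:

\[
\begin{bmatrix}
L[f_1(t_1)] & L[f_2(t_1)] & \cdots & L[f_n(t_1)] \\
L[f_1(t_2)] & L[f_2(t_2)] & \cdots & L[f_n(t_2)] \\
\vdots & \vdots & & \vdots \\
L[f_1(t_{n-2})] & L[f_2(t_{n-2})] & \cdots & L[f_n(t_{n-2})] \\
f_1(a) & f_2(a) & \cdots & f_n(a) \\
f_1(b) & f_2(b) & \cdots & f_n(b)
\end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_{n-2} \\ p_{n-1} \\ p_n \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ q \\ r \end{bmatrix}
\tag{2.36}
\]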


This matrix equation can, of course, be solved by any of the standard methods of linear
algebra. If the operator L is nonlinear then (2.34) cannot be expressed in the form
(2.35). The equations (2.34) may still be solved for the coefficients $p_i$ but the solution
of nonlinear equations is, in general, a much more difficult task than the solution of
linear ones.

The choice of the functions $f_i(t)$ and the collocation points $t_k$ greatly affects the
accuracy and speed of convergence of the solution. (The speed of convergence in this context
is usually measured by the number of terms it is necessary to take in the approximation
(2.32) in order to achieve a solution with a specified accuracy.) Example 2.17 shows
a simple application of collocation methods to the solution of a second-order
boundary-value problem.


Example 2.17

Solve the boundary-value problem

\[ \frac{d^2x}{dt^2} + e^t \frac{dx}{dt} + x = 0, \quad x(0) = 0, \quad x(2) = 1 \tag{2.37} \]

using a collocation method with

\[ X = \sum_{i=1}^{n} p_i t^{i-1} \]

Solution

The differential operator in this case is linear, so we may construct the matrix equation
equivalent to (2.36). With the given approximation, we have

\[
L[f_i(t)] = L[t^{i-1}] =
\begin{cases}
(i-1)(i-2)t^{i-3} + (i-1)e^t t^{i-2} + t^{i-1} & (i \ge 3) \\
e^t + t & (i = 2) \\
1 & (i = 1)
\end{cases}
\]
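We shall choose the collocation points to be equally spaced over the interior of the
interval [0, 2]. Thus, for n = 5, say, we need three collocation points, which would be
0.5, 1.0 and 1.5. We should therefore obtain the matrix equation

\[
\begin{bmatrix}
L[f_1(0.5)] & L[f_2(0.5)] & L[f_3(0.5)] & L[f_4(0.5)] & L[f_5(0.5)] \\
L[f_1(1.0)] & L[f_2(1.0)] & L[f_3(1.0)] & L[f_4(1.0)] & L[f_5(1.0)] \\
L[f_1(1.5)] & L[f_2(1.5)] & L[f_3(1.5)] & L[f_4(1.5)] & L[f_5(1.5)] \\
1 & 0 & 0 & 0 & 0 \\
1 & 2 & 4 & 8 & 16
\end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ p_4 \\ p_5 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
\]

Computing the numerical values of the matrix elements yields the matrix equation

\[
\begin{bmatrix}
1.000 & 2.149 & 3.899 & 4.362 & 3.887 \\
1.000 & 3.718 & 8.437 & 15.155 & 23.873 \\
1.000 & 5.982 & 17.695 & 42.626 & 92.565 \\
1.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
1.000 & 2.000 & 4.000 & 8.000 & 16.000
\end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ p_4 \\ p_5 \end{bmatrix}
=
\begin{bmatrix} 0.000 \\ 0.000 \\ 0.000 \\ 0.000 \\ 1.000 \end{bmatrix}
\]

whose solution is

\[ p = [0.000 \quad 2.636 \quad -1.912 \quad 0.402 \quad 0.010]^{T} \]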
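The matrix equation of this example is small enough to assemble and solve directly. The
sketch below is one possible way of doing so in plain Python, assuming the operator
L = d^2/dt^2 + e^t d/dt + 1 and the power-series trial functions f_i(t) = t^(i-1) of the
example; the function names and the hand-written Gaussian elimination are choices made
for this illustration.

```python
import math

def L_power(i, t):
    """Apply L = d2/dt2 + e^t d/dt + 1 to f_i(t) = t**(i - 1), i = 1, 2, ..."""
    k = i - 1                       # the power of t
    value = t**k
    if k >= 1:
        value += math.exp(t) * k * t**(k - 1)
    if k >= 2:
        value += k * (k - 1) * t**(k - 2)
    return value

def solve_linear(A, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def collocate(n, a=0.0, b=2.0, xa=0.0, xb=1.0):
    """n - 2 equally spaced interior collocation rows plus two boundary rows."""
    points = [a + (b - a) * k / (n - 1) for k in range(1, n - 1)]
    A = [[L_power(i, t) for i in range(1, n + 1)] for t in points]
    A.append([a**(i - 1) for i in range(1, n + 1)])
    A.append([b**(i - 1) for i in range(1, n + 1)])
    rhs = [0.0] * (n - 2) + [xa, xb]
    return solve_linear(A, rhs)

p = collocate(5)
print([round(v, 3) for v in p])
```

Calling `collocate(n)` with other values of n gives the coefficients of approximations
with more or fewer terms.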


Figure 2.24 shows the solutions X4, X,, X; and X;. As we should intuitively expect, 
taking more terms in the approximation for x(t) causes the successive approximations 
to converge. In Figure 2.25 the approximations X; and X, are compared with a solution
to the problem (2.37) obtained by the method of shooting using a second-order
Runge-Kutta integration method. The step size used for the method-of-shooting solution was
estimated, using the technique introduced in Section 2.3.6, to yield a solution accurate
to better than $3.5 \times 10^{-2}$. On this graph the solution X; was indistinguishable from the
method-of-shooting solution.

Figure 2.24 A collocation solution of (2.37).

Figure 2.25 Comparison of the collocation solutions with the solution by the method of shooting.


Although Example 2.17 gave reasonably good accuracy from a relatively small
number of terms in the trial function, difficulties do arise with collocation methods when
straightforward power-series approximations like this are used. It is more normal to use
some form of orthogonal polynomials, such as Tchebyshev or Legendre polynomials,
for the $f_i(t)$. In appropriate cases $f_i(t) = \sin it$ and $\cos it$ are also used. The reader is
referred to more advanced texts for details of these functions and their use in collocation
methods.

Although they are rather more commonly used for problems involving partial dif- 
ferential equations, finite-element methods may also be used for ordinary differential 
equation boundary-value problems. The essential difference between finite-element 
methods and collocation methods of the type described in Example 2.17 lies in the type 
of functions used to approximate the dependent variable. Finite-element methods use 
functions with localized support. By this, we mean functions that are zero over large 
parts of the range of the independent variable and only have a non-zero value for some 
restricted part of the range. A complete approximation to the dependent variable may 
be constructed from a linear sum of such functions, the coefficients in the linear sum 
providing the parameters of the function approximation. 


Example 2.18

A typical simple set of functions with localized support that are often used in the
finite-element method are the 'witch's hat' functions. For a one-dimensional boundary-value
problem, such as (2.31), the range [a, b] of the independent variable is divided into a
number of subranges $[t_0, t_1], [t_1, t_2], \ldots, [t_{n-1}, t_n]$ with $t_0 = a$ and $t_n = b$.
We then define functions

\[
f_k(t) =
\begin{cases}
0 & (t \le t_{k-1}) \\
(t - t_{k-1})/(t_k - t_{k-1}) & (t_{k-1} < t \le t_k) \\
(t_{k+1} - t)/(t_{k+1} - t_k) & (t_k < t < t_{k+1}) \\
0 & (t \ge t_{k+1})
\end{cases}
\]

The function $f_k(t)$ has support (that is, its value is non-zero) only on the interval
$[t_{k-1}, t_{k+1}]$. Figure 2.26 shows the form of the functions $f_k(t)$. An approximation to the
solution of a boundary-value problem can be formed as

\[ X(t) = \sum_{k=0}^{n} p_k f_k(t) \tag{2.38} \]

This equation defines a function that is piecewise-linear and continuous on the range
[a, b] as illustrated in Figure 2.27.

Figure 2.26 The 'witch's hat' functions.

Figure 2.27 The construction of a continuous piecewise-linear approximation function from
'witch's hat' functions.
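A short sketch may help to show how such localized functions combine. It assumes the usual
unit-height hat function centred on each node (the normalization, the function names, the
node positions and the nodal values below are all assumptions made for illustration), so
that each parameter is simply the value of the approximation at its node.

```python
def hat(k, nodes, t):
    """Unit-height 'witch's hat' centred on nodes[k], zero beyond its two neighbours."""
    tk = nodes[k]
    if k > 0 and nodes[k - 1] < t <= tk:
        return (t - nodes[k - 1]) / (tk - nodes[k - 1])
    if k < len(nodes) - 1 and tk <= t < nodes[k + 1]:
        return (nodes[k + 1] - t) / (nodes[k + 1] - tk)
    return 0.0

def piecewise_linear(params, nodes, t):
    """X(t) = sum over k of p_k f_k(t): continuous and piecewise-linear."""
    return sum(p * hat(k, nodes, t) for k, p in enumerate(params))

nodes = [0.0, 0.1, 0.25, 0.5, 1.0, 2.0]     # unequal subranges, finer near t = 0
params = [0.0, 0.4, 0.8, 1.1, 1.2, 1.0]     # made-up nodal values
print(piecewise_linear(params, nodes, 0.25))   # the nodal value at t = 0.25
print(piecewise_linear(params, nodes, 0.75))   # midway between two nodal values
```

At any point at most two hat functions are non-zero, which is what makes the resulting
algebraic systems sparse.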


The finite-element method provides a general framework for using functions with
localized support to construct an approximation to the whole solution. One advantage
of using such functions is that the user can, to a considerable extent, tailor the
approximation used to the properties of the physical problem. If the problem is expected
to give rise to very rapid changes in some region then more functions with local support
in that area can be used. In regions where the solution is expected to change relatively
slowly fewer functions may be used. In Figure 2.27, for instance, the division of the
interval [a, b] into subregions is shown as being finer near one end of the interval and
coarser near the other. This
property of functions with local support gives the finite-element method considerable
advantages over collocation methods (which use functions defined over the whole
range of the problem) and over finite-difference methods.

Just as for the function approximation method illustrated in Example 2.17, the finite- 
element method requires that some criterion be chosen for determining the values of 
the unknown parameters in the approximation (2.38). A variety of criteria are commonly 
used, but we shall not describe these in detail in this section. The use of the finite-element 
method for obtaining numerical solutions of partial differential equations is described 
in Section 9.6. 


2.5 Engineering application: oscillations of a pendulum


The simple pendulum has been used for hundreds of years as a timing device. A 
pendulum clock, using either a falling weight or a clockwork spring device to provide 
motive power, relies on the natural periodic oscillations of a pendulum to ensure good 
timekeeping. Generally we assume that the period of a pendulum is constant regardless 
of its amplitude. But this is only true for infinitesimally small amplitude oscillations. In 
reality the period of a pendulum's oscillations depends on its amplitude. In this section 
we will use our knowledge of numerical analysis to assist in an investigation of this 
relationship. 

Figure 2.28 shows a simple rigid pendulum mounted on a frictionless pivot swinging
in a single plane. By resolving forces in the tangential direction we have, following the
classical analysis of such pendulums,

\[ m a \frac{d^2\theta}{dt^2} = -m g \sin\theta \]

that is,

\[ \frac{d^2\theta}{dt^2} + \frac{g}{a}\sin\theta = 0 \tag{2.39} \]

For small oscillations of the pendulum we can use the approximation $\sin\theta \approx \theta$ so the
equation becomes

\[ \frac{d^2\theta}{dt^2} + \frac{g}{a}\theta = 0 \tag{2.40} \]

which is, of course, the simple harmonic motion equation with solutions

\[ \theta = A\cos\Big(\sqrt{\frac{g}{a}}\,t\Big) + B\sin\Big(\sqrt{\frac{g}{a}}\,t\Big) \]

Hence the period of the oscillations is $2\pi\sqrt{a/g}$ and is independent of the amplitude of
the oscillations.

Figure 2.28 A simple pendulum.

In reality, of course, the amplitude of the oscillations may not be small enough for
the linear approximation $\sin\theta \approx \theta$ to be valid, so it would be useful to be able to solve
(2.39). Equation (2.39) is nonlinear so its solution is rather more problematical than
(2.40). We will solve the equation numerically. In order to make the solution a little
more transparent we will scale it so that the period of the oscillations of the linear
approximation (2.40) is unity. This is achieved by setting $t = 2\pi\sqrt{a/g}\,\tau$. Equation (2.39)
then becomes

\[ \frac{d^2\theta}{d\tau^2} + 4\pi^2 \sin\theta = 0 \tag{2.41} \]
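The effect of the time scaling can be checked with the chain rule. In the short derivation
below the symbol T, introduced only for this check, denotes the period of the linear
approximation.

```latex
t = T\tau, \quad T = 2\pi\sqrt{a/g}
\;\Longrightarrow\;
\frac{d^2\theta}{dt^2} = \frac{1}{T^2}\frac{d^2\theta}{d\tau^2}
                       = \frac{g}{4\pi^2 a}\frac{d^2\theta}{d\tau^2},
\qquad
\frac{g}{4\pi^2 a}\frac{d^2\theta}{d\tau^2} + \frac{g}{a}\sin\theta = 0
\;\Longrightarrow\;
\frac{d^2\theta}{d\tau^2} + 4\pi^2\sin\theta = 0 .
```

In the scaled variable the small-amplitude solution has period exactly 1, so any
departure of the computed period from unity measures the effect of the nonlinearity.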


For an initial amplitude of 30°, the pseudocode algorithm shown in Figure 2.29, which
implements the fourth-order Runge-Kutta method described in Section 2.3.8, produces
the results $\Theta(6.0) = 23.965\,834$ using a time step of 0.05 and $\Theta(6.0) = 24.018\,659$ with
a step of 0.025. Using Richardson extrapolation (see Section 2.3.6) we can predict that
the time step needed to achieve 5 dp of accuracy (i.e. an error less than $5 \times 10^{-6}$) with
this fourth-order method is

\[ \left( \frac{0.000\,005 \times (2^4 - 1)}{|23.965\,834 - 24.018\,659|} \right)^{1/4} \times 0.025 = 0.0049 \]

Repeating the calculation with time steps 0.01 and 0.005 gives $\Theta(6.0) = 24.021\,872\,7$ and
$\Theta(6.0) = 24.021\,948\,1$ for which Richardson extrapolation implies an error of $5 \times 10^{-6}$
as predicted. 
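The calculation just described can be sketched in Python as follows. This is an
illustrative sketch rather than a transcription of the program used for the quoted
results: it assumes the classical fourth-order Runge-Kutta scheme applied to the scaled
pendulum equation written as a pair of first-order equations, and the function names are
invented for the purpose.

```python
import math

def pendulum_rk4(amplitude_deg, h, tau_end=6.0):
    """Integrate theta'' + 4 pi^2 sin(theta) = 0 by RK4; return theta(tau_end) in degrees."""
    def f1(x, v):
        return v
    def f2(x, v):
        return -4.0 * math.pi**2 * math.sin(x)
    x, v = math.radians(amplitude_deg), 0.0
    for _ in range(round(tau_end / h)):
        c11, c21 = h * f1(x, v), h * f2(x, v)
        c12, c22 = h * f1(x + c11 / 2, v + c21 / 2), h * f2(x + c11 / 2, v + c21 / 2)
        c13, c23 = h * f1(x + c12 / 2, v + c22 / 2), h * f2(x + c12 / 2, v + c22 / 2)
        c14, c24 = h * f1(x + c13, v + c23), h * f2(x + c13, v + c23)
        x += (c11 + 2 * (c12 + c13) + c14) / 6
        v += (c21 + 2 * (c22 + c23) + c24) / 6
    return math.degrees(x)

def richardson_step(coarse, fine, h_fine, target, order=4):
    """Step size predicted to bring the error of a method of the given order below target."""
    error_fine = abs(coarse - fine) / (2**order - 1)
    return h_fine * (target / error_fine) ** (1.0 / order)

theta_a = pendulum_rk4(30.0, 0.05)
theta_b = pendulum_rk4(30.0, 0.025)
print(theta_a, theta_b, richardson_step(theta_a, theta_b, 0.025, 5e-6))
```

The helper `richardson_step` simply rearranges the error estimate for a fourth-order
method, so feeding it the two values quoted in the text gives the same predicted step.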


These results could also have been obtained using MAPLE as shown by the following
worksheet:
> deqsys:=diff(x(t),t,t)+4*Pi^2*sin(x(t))=0;
> inits:=x(0)=30/180*Pi,D(x)(0)=0;
> sol:=dsolve({deqsys, inits}, numeric,method=classical
[rk4],output=listprocedure,stepsize=0.005);
> xx:=op(2,sol[2]);xx(6)*180/Pi;


As a check we can draw the graph of $(\Theta_{0.01}(\tau) - \Theta_{0.005}(\tau))/15$, shown in Figure 2.30.
This confirms that the error grows as the solution advances and that the maximum error
is around $7.5 \times 10^{-6}$.

What we actually wanted is an estimate of the period of the oscillations. The most 
satisfactory way to determine this is to find the interval between the times of successive 
zero crossings. The time of a zero crossing can be estimated by linear interpolation between 
the data points produced in numerical solution of the differential equation. At a zero 
crossing the successive values of $\Theta$ have the opposite sign. Figure 2.31 shows a modified
version of the main part of the algorithm of Figure 2.29. This version determines the times 
of successive positive to negative zero crossings and the differences between them. 
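A runnable sketch of the zero-crossing idea is given below. It is an illustration under
stated assumptions, not the program behind the quoted figures: it uses a classical
fourth-order Runge-Kutta step, locates each positive-to-negative crossing by linear
interpolation between the two bracketing data points, and returns only the intervals
between successive crossings. The function name and default step size are assumptions.

```python
import math

def pendulum_periods(amplitude_deg, h=0.005, tau_end=6.0):
    """Intervals between successive positive-to-negative zero crossings of theta."""
    def accel(x):
        return -4.0 * math.pi**2 * math.sin(x)
    def rk4_step(x, v):
        c11, c21 = h * v, h * accel(x)
        c12, c22 = h * (v + c21 / 2), h * accel(x + c11 / 2)
        c13, c23 = h * (v + c22 / 2), h * accel(x + c12 / 2)
        c14, c24 = h * (v + c23), h * accel(x + c13)
        return (x + (c11 + 2 * (c12 + c13) + c14) / 6,
                v + (c21 + 2 * (c22 + c23) + c24) / 6)
    t, x, v = 0.0, math.radians(amplitude_deg), 0.0
    crossings = []
    for _ in range(round(tau_end / h)):
        xn, vn = rk4_step(x, v)
        if xn * x < 0 and x > 0:
            # Linear interpolation between (t, x) and (t + h, xn).
            crossings.append((t * xn - (t + h) * x) / (xn - x))
        t, x, v = t + h, xn, vn
    return [b - a for a, b in zip(crossings, crossings[1:])]

periods = pendulum_periods(30.0)
print(periods)
```

For a 30° amplitude the intervals come out slightly greater than 1, that is, slightly
longer than the period of the linearized pendulum.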

Figure 2.32 shows some results from a program based on the algorithm of Figure 2.31;
it is evident that the period has been determined to 6 sf accuracy. Figure 2.33 has been
compiled from similar results for other amplitudes of oscillation.

Figure 2.29 A pseudocode algorithm for solving the nonlinear pendulum equation (2.41).

    tol ← 0.00001
    t_start ← 0
    t_end ← 6
    write(vdu,'Enter amplitude => ')
    read(keyb,x0)
    x_start ← pi*x0/180
    v_start ← 0
    write(vdu,'Enter stepsize => ')
    read(keyb,h)
    write(vdu,t_start,' ',deg(x_start))
    t ← t_start
    x ← x_start
    v ← v_start
    repeat
      rk4(x,v,h → xn,vn)
      x ← xn
      v ← vn
      t ← t + h
    until abs(t - t_end) < tol
    write(vdu,t,' ',deg(x))

    procedure rk4(x,v,h → xn,vn)
      c11 ← h*f1(x,v)
      c21 ← h*f2(x,v)
      c12 ← h*f1(x + c11/2,v + c21/2)
      c22 ← h*f2(x + c11/2,v + c21/2)
      c13 ← h*f1(x + c12/2,v + c22/2)
      c23 ← h*f2(x + c12/2,v + c22/2)
      c14 ← h*f1(x + c13,v + c23)
      c24 ← h*f2(x + c13,v + c23)
      xn ← x + (c11 + 2*(c12 + c13) + c14)/6
      vn ← v + (c21 + 2*(c22 + c23) + c24)/6
    endprocedure

    procedure f1(x,v → f1)
      f1 ← v
    endprocedure

    procedure f2(x,v → f2)
      f2 ← -4*pi*pi*sin(x)
    endprocedure

    procedure deg(x → deg)
      deg ← 180*x/pi
    endprocedure


Some spring-powered pendulum clocks are observed to behave in a counter-intuitive
way: as the spring winds down the clock gains time where most people intuitively
expect it to run more slowly and hence lose time. Figure 2.33 explains this phenomenon.
The reason is that, in a spring-powered clock, the spring, acting through the
escapement mechanism, exerts forces on the pendulum which, over each cycle of
oscillation of the pendulum, result in the application of a tiny net impulse. The result is that
just sufficient work is done on the pendulum to overcome the effects of bearing friction,
air resistance and any other dissipative effects, and to keep the pendulum swinging with
constant amplitude. But, as the spring unwinds the force available is reduced and the
impulse gets smaller. The result is that, as the clock winds down, the amplitude of
oscillation of the pendulum decreases slightly. Figure 2.33 shows that as the amplitude
decreases the period also decreases. Since the period of the pendulum controls the
speed of the clock, the clock runs faster as the period decreases! Of course, as the clock
winds down even further, the spring reaches a point where it is no longer capable of
applying a sufficient impulse to overcome the dissipative forces, the pendulum ceases
swinging and the clock finally stops.

Figure 2.30 Error in solution of equation (2.41) using the algorithm of Figure 2.29 with
h = 0.005. (Axes: error against time.)

Figure 2.31 Modification of pseudocode algorithm to find the period of oscillations of
equation (2.41).

    tol ← 0.00001
    t_start ← 0
    t_end ← 6
    write(vdu,'Enter amplitude => ')
    read(keyb,x0)
    x_start ← pi*x0/180
    v_start ← 0
    write(vdu,'Enter stepsize => ')
    read(keyb,h)
    write(vdu,t_start,' ',deg(x_start))
    t ← t_start
    x ← x_start
    v ← v_start
    t_previous_cross ← t_start
    repeat
      rk4(x,v,h → xn,vn)
      if (xn*x < 0) and (x > 0) then
        t_cross ← (t*xn - (t + h)*x)/(xn - x)
        write(vdu,t_cross,' ',t_cross - t_previous_cross)
        t_previous_cross ← t_cross
      endif
      x ← xn
      v ← vn
      t ← t + h
    until abs(t - t_end) < tol

Figure 2.32 Periods of …

Figure 2.33

The periods of the oscillations can also be measured using MAPLE. The procedure
fsolve finds numerically the roots of a function. The output of the procedure
dsolve is a function so we can use fsolve to find the zeros of that function, as in
the following MAPLE worksheet. Note that the period of successive cycles is found
more accurately and consistently using MAPLE. This is because the procedure
fsolve uses a higher-order method to locate the zeros of the function rather than
the linear interpolation method outlined in the algorithm in Figure 2.31.


оаа ewe 8s 

= fer i trem 1 te 6 clos 
ti:-fsolve(xx(t) (E (ab ream (esl te e 
t2=tsolve (xx (t) (ЕО) 
оа ОВЕ ВО ун О УЕ т ЕЕ 
end do;


2.6 Engineering application: heating of an electrical fuse


The electrical fuse is a simple device for protecting an electrical apparatus or circuit
from overload and possible damage after the failure of one or more components in the
apparatus. A fuse is usually a short length of thin wire through which the electrical current
powering the apparatus flows. If the apparatus fails in such a way as to draw a dangerously
increased current, the fuse wire heats up and eventually melts thus disconnecting the
apparatus from the power source. In order to design fuses which will not fail during
normal use but which will operate reliably and rapidly in abnormal circumstances we
must understand the heating of a thin wire carrying an electrical current.
The equation governing the heat generation and dissipation in a wire carrying an
electrical current can be formulated as

\[ -k\pi r^2 \frac{d^2T}{dx^2} + 2\pi r h (T - T_a)^{\alpha} = \frac{I^2\rho}{\pi r^2} \tag{2.42} \]

where T is the temperature of the fuse wire, x is the distance along the wire, k is the
thermal conductivity of the material of which the wire is composed, r is the radius of
the wire, h is the convective heat transfer coefficient from the surface of the wire, $T_a$ is
the ambient temperature of the fuse's surroundings, $\alpha$ is an empirical constant with a
value around 1.25, I is the current in the wire and $\rho$ is the resistivity of the wire.
Equation (2.42) expresses the balance, in the steady state, between heat generation and heat
loss. The first term of the equation represents the transfer of heat along the wire by
conduction, the second term is the loss of heat from the surface of the wire by convection
and the third term is the generation of heat in the wire by the electrical current.
Taking $\theta = (T - T_a)$ and dividing by $k\pi r^2$, (2.42) can be expressed as

\[ \frac{d^2\theta}{dx^2} - \frac{2h}{kr}\theta^{\alpha} = -\frac{\rho I^2}{k\pi^2 r^4} \tag{2.43} \]

Letting the length of the fuse be 2a and scaling the space variable, x, by setting x = 2aX,
(2.43) becomes

\[ \frac{d^2\theta}{dX^2} - \frac{8a^2h}{kr}\theta^{\alpha} = -\frac{4a^2\rho I^2}{k\pi^2 r^4} \]

The boundary conditions are that the two ends of the wire, which are in contact with the
electrical terminals in the fuse unit, are kept at some fixed temperature (we will assume
that this temperature is the same as $T_a$). In addition, the fuse has symmetry about its
midpoint x = a. Hence we may express the complete differential equation problem as

\[ \frac{d^2\theta}{dX^2} - \frac{8a^2h}{kr}\theta^{\alpha} = -\frac{4a^2\rho I^2}{k\pi^2 r^4}, \quad \theta(0) = 0, \quad \frac{d\theta}{dX}(1) = 0 \tag{2.44} \]


Equation (2.44) is a nonlinear second-order ordinary differential equation. There is
no straightforward analytical technique for tackling it so we must use numerical means.
The problem is a boundary-value problem so we must use either the method of shooting
or some function approximation method. Figure 2.34 shows a pseudocode algorithm for
this problem and Figure 2.35 gives the supporting procedures. The procedure 'desolve'
assumes initial conditions of the form $\theta(0) = 0$, $d\theta/dX(0) = \theta'_0$, and solves the differential
equation using the third-order predictor-corrector method (with a single fourth-order
Runge-Kutta step to start the multistep process). The main program uses the method of
regula falsa to iterate from two starting values of $\theta'_0$ which bracket that value of $\theta'_0$
corresponding to $d\theta/dX(1) = 0$ which we seek.

Figure 2.36 shows the result of computations using a program based on the algorithm
in Figure 2.34. Taking the values of the physical constants as $h = 100\ \mathrm{W\,m^{-2}\,K^{-1}}$, a = 0.01 m,
$k = 63\ \mathrm{W\,m^{-1}\,K^{-1}}$, $\rho = 16 \times 10^{-8}\ \Omega\,\mathrm{m}$ and $r = 5 \times 10^{-4}$ m, and taking I as 20 amps and
40 amps, gives the lower and upper curves in Figure 2.36 respectively.

Figure 2.34 Pseudocode algorithm for solving equation (2.44).

    rho ← 16e-8
    kappa ← 63
    r ← 5e-4
    a ← 1e-2
    hh ← 1e2
    i ← 20
    pconst ← 8*hh*a*a/(kappa*r)
    qconst ← 4*a*a*rho*i*i/(kappa*pi*pi*r*r*r*r)
    tol ← 1e-5
    x_start ← 0.0
    x_end ← 1.0
    theta_start ← 0.0
    write(vdu,'Enter stepsize -->')
    read(keyb,h)
    write(vdu,'Enter lower limit -->')
    read(keyb,theta_dash_low)
    write(vdu,'Enter upper limit -->')
    read(keyb,theta_dash_high)
    desolve(x_start,x_end,h,theta_start,theta_dash_low → th,ql)
    desolve(x_start,x_end,h,theta_start,theta_dash_high → th,qh)
    repeat
      theta_dash_new ← (qh*theta_dash_low - ql*theta_dash_high)/(qh - ql)
      desolve(x_start,x_end,h,theta_start,theta_dash_new → th,qn)
      if ql*qn > 0 then
        ql ← qn
        theta_dash_low ← theta_dash_new
      else
        qh ← qn
        theta_dash_high ← theta_dash_new
      endif
    until abs(qn) < tol
    write(vdu,th,qn)

    procedure desolve(x_0,x_end,h,v1_0,v2_0 → v1_f,v2_f)
      x ← x_0
      v1_o ← v1_0
      v2_o ← v2_0
      rk4(x,v1_o,v2_o,h → v1,v2)
      x ← x + h
      repeat
        pc3(x,v1_o,v2_o,v1,v2,h → v1_n,v2_n)
        v1_o ← v1
        v2_o ← v2
        v1 ← v1_n
        v2 ← v2_n
        x ← x + h
      until abs(x - x_end) < tol
      v1_f ← v1
      v2_f ← v2
    endprocedure
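For readers who want to experiment, the following Python sketch solves the same scaled
problem by shooting. It deliberately simplifies the pseudocode: a fourth-order
Runge-Kutta integrator is used throughout in place of the predictor-corrector method,
and plain bisection on the initial slope replaces regula falsa. The physical constants
are those quoted in the text; the function names, the step count and the bracketing
interval for the slope are assumptions of the sketch.

```python
import math

# k, r, a, h, rho and alpha as quoted in the text (SI units).
K, R, A, H, RHO, ALPHA = 63.0, 5e-4, 1e-2, 100.0, 16e-8, 1.25

def end_state(current, slope, n=1000):
    """RK4 from X = 0 with theta = 0, theta' = slope; return (theta, theta') at X = 1."""
    pconst = 8 * H * A * A / (K * R)
    qconst = 4 * A * A * RHO * current**2 / (K * math.pi**2 * R**4)
    def f2(theta):
        # Guard against a fractional power of a non-positive number.
        return -qconst if theta <= 0 else pconst * theta**ALPHA - qconst
    h = 1.0 / n
    th, dth = 0.0, slope
    for _ in range(n):
        c11, c21 = h * dth, h * f2(th)
        c12, c22 = h * (dth + c21 / 2), h * f2(th + c11 / 2)
        c13, c23 = h * (dth + c22 / 2), h * f2(th + c12 / 2)
        c14, c24 = h * (dth + c23), h * f2(th + c13)
        th += (c11 + 2 * (c12 + c13) + c14) / 6
        dth += (c21 + 2 * (c22 + c23) + c24) / 6
        if th > 1e6:                 # runaway trial: the slope was far too large
            break
    return th, dth

def midpoint_temperature(current, low=0.0, high=2000.0, tol=1e-6):
    """Bisect on the initial slope until theta'(1) is (nearly) zero."""
    for _ in range(200):
        mid = 0.5 * (low + high)
        theta_end, slope_end = end_state(current, mid)
        if abs(slope_end) < tol or high - low < 1e-12:
            break
        if slope_end > 0:
            high = mid               # overshoot: theta is still rising at X = 1
        else:
            low = mid                # undershoot: theta peaked before X = 1
    return theta_end

print(midpoint_temperature(20.0), midpoint_temperature(40.0))
```

Bisection is slower than regula falsa but needs only a sign test, which keeps the sketch
short; the two printed values are the temperature rises above ambient at X = 1 for
currents of 20 and 40 amps.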


Evidently at 20 amps the operating temperature of the middle part of the wire is
about 77° above the ambient temperature. If the current increases to 40 amps the
temperature increases to about 245° above ambient, just above the melting point of tin!
The procedure could obviously be used to design and validate appropriate dimensions
(length and diameter) for fuses made from a variety of metals for a variety of
applications and rated currents.

Figure 2.35 Subsidiary procedures for pseudocode algorithm for solving equation (2.44).

    procedure rk4(x,v1,v2,h → v1n,v2n)
      c11 ← h*f1(x,v1,v2)
      c21 ← h*f2(x,v1,v2)
      c12 ← h*f1(x + h/2,v1 + c11/2,v2 + c21/2)
      c22 ← h*f2(x + h/2,v1 + c11/2,v2 + c21/2)
      c13 ← h*f1(x + h/2,v1 + c12/2,v2 + c22/2)
      c23 ← h*f2(x + h/2,v1 + c12/2,v2 + c22/2)
      c14 ← h*f1(x + h,v1 + c13,v2 + c23)
      c24 ← h*f2(x + h,v1 + c13,v2 + c23)
      v1n ← v1 + (c11 + 2*(c12 + c13) + c14)/6
      v2n ← v2 + (c21 + 2*(c22 + c23) + c24)/6
    endprocedure

    procedure pc3(x,v1_o,v2_o,v1,v2,h → v1_n,v2_n)
      v1_p ← v1 + h*(3*f1(x,v1,v2) - f1(x - h,v1_o,v2_o))/2
      v2_p ← v2 + h*(3*f2(x,v1,v2) - f2(x - h,v1_o,v2_o))/2
      v1_n ← v1 + h*(5*f1(x + h,v1_p,v2_p)
                  + 8*f1(x,v1,v2) - f1(x - h,v1_o,v2_o))/12
      v2_n ← v2 + h*(5*f2(x + h,v1_p,v2_p)
                  + 8*f2(x,v1,v2) - f2(x - h,v1_o,v2_o))/12
    endprocedure

    procedure f1(x,theta,theta_dash → f1)
      f1 ← theta_dash
    endprocedure

    procedure f2(x,theta,theta_dash → f2)
      if theta < tol then
        f2 ← -qconst
      else
        f2 ← pconst*exp(ln(theta)*1.25) - qconst
      endif
    endprocedure

Figure 2.36 Comparison of temperatures in a fuse wire carrying 20 amps and 40 amps.

The differential equation problem to be solved in this application is a boundary-
value problem rather than an initial-value problem. MAPLE's dsolve procedure
can readily handle this type of problem. The following MAPLE worksheet reproduces
the temperature profiles shown in Figure 2.36.


> deqsys:=diff(theta(x),x,x)-8*a^2*h/
(k*r)*theta(x)^alpha=-4*a^2*rho*i^2/(k*Pi^2*r^4);
> inits:=theta(0)=0,D(theta)(1)=0;
> alpha:=1.25;h:=100;a:=0.01;k:=63;rho:=16e-8;r:=5e-4;
i:=20;
> sol1:=dsolve({deqsys, inits},
numeric, output=listprocedure,maxmesh=512);
> i:=40;
> sol2:=dsolve({deqsys, inits},
numeric, output=listprocedure,maxmesh=512);
EXE XOT Р Быш ш UN E ET o oT (E PES nl alee ШЕ 
ЕЛО ООО Soll |2))) ,co(2, sol2 (21) 1,0.) e 


To find a numerical solution of a second-order differential equation using
MATLAB, the user must first carry out the transformation to a set of two first-order
equations; MATLAB, unlike MAPLE, cannot complete this stage internally. Then
the following MATLAB M-file solves the differential equation and reproduces the
temperature profiles shown in Figure 2.36.


function engineering_app2 

а= ОО = 100 k= 63)-e—5e—4) alliplia—t2 5) 1@— 1G e— 8 120), 
solinit = bvpinit(linspace(0,1,10),[40 0.51); 
soll = bvp4c(@odefun, @bcfun, solinit) ; 

BESTE 

sol2 = bvp4c(@odefun, @bcfun, solinit) ; 

x = spacen ombi 

wil = cleweill (S@ILIL хх); 

Уо = асела сото 9) 

КЫ бу ШЕ ТЕА 

y1(1,100) 

VAL LOO) 


function dydx = odefun(x,y) 
пун у )) 
Васа ча (Ей (О а ыска Онок ОЛ оа АВЕ: 
епа 
function res - bcfun(ya,yb) 
Ces E IN 
ION 
end 
end 
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2.7 Review exercises (1-12) 


i 


2 


3 


4 


5 


Find the value of X(0.5) for the initial-value 
problem 


dx 
ЕТО) 
m Ax, (0) 
using Euler's method with step size h = 0.1.


Find the value of X(1.2) for the initial-value 
problem 
dx = xt 


erm х(1) =1 


using Euler's method with step size h = 0.05. 


Solve the differential equation 


s x(0) = 1 


dí Nr 


to find the value of X(0.4) using the Euler method 
with steps of size 0.1 and 0.05. By comparing the 
two estimates of x(0.4) estimate the accuracy of the 
better of the two values which you have obtained 
and also the step size you would need to use in order 
to calculate an estimate of x(0.4) accurate to two 
decimal places. 


Solve the differential equation 


dx : D ES 

qus sin (t^), x(0)=2 

to find the value of X(0.25) using the Euler method 
with steps of size 0.05 and 0.025. By comparing 
the two estimates of x(0.25) estimate the accuracy 
of the better of the two values which you have 
obtained and also the step size you would need to 
use in order to calculate an estimate of x(0.25) 
accurate to three decimal places. 


Let X₁, X₂ and X₃ denote the estimates of the
function x(t) satisfying the differential equation


dx 

= (xt, x(122 

T= \Gt+9, x1) 
which are calculated using the second-order
predictor-corrector method with steps of 0.1, 0.05
and 0.025 respectively. Compute X₁(1.2), X₂(1.2)
and X₃(1.2). Show that the ratio of |X₂ − X₁| and
|X₃ − X₂| should tend to 4:1 as the step size
tends to zero. Do your computations bear out
this expectation?


Compute the solution of the differential equation 


dx -t 

—-—e , x(0)-5 

dt © 

for x = 0 to 2 using the fourth-order Runge-Kutta 
method with step sizes of 0.2, 0.1 and 0.05. 
Estimate the accuracy of the most accurate of 
your three solutions. 


In a thick cylinder subjected to internal pressure 
the radial pressure p(r) at distance r from the axis 
of the cylinder is given by 


E 
dr 


where a is a constant (which depends on the 
geometry of the cylinder). 

If the stress has magnitude p₀ at the inner wall,
r = r₀, and may be neglected at the outer wall,
r = r₁, show that


ро 5 
NE 


a AN SIA 


If r₀ = 1, r₁ = 2 and p₀ = 1, compare the value
of p(1.5) obtained from this analytic solution
with the numerical value obtained using the
fourth-order Runge-Kutta method with step size
h = 0.5. (Note: with these values of r₀, r₁ and p₀,
a = −1/3.)


Find the values of X(t) for t up to 2 where X(t)
is the solution of the differential equation
problem


3 с? 2 
а «a(&) - ix -sinf, 
dt dt dt 
2 
x1)-02, €) 21, 1) =0 
dt dt 


using the Euler method with steps of 0.025. 
Repeat the computation with a step size of 
0.0125. Hence estimate the accuracy of 

the value of X(2) given by your solution. 
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Find the solution of the differential equation problem Investigate the properties of the Van der 
В Pol oscillator. In particular show that the 
dx ЕЕ (x = pe + 40x =0, oscillator shows limit cycle behaviour (that 
dt is, the oscillations tend to a form which is 


independent of the initial conditions and depends 
only on the parameter u). Determine the 
dependence of the limit cycle period on Li. 


= dxo) = 
х(0) = 0.02, (0) =0 


using the second-order predictor-corrector 
method. Hence find an estimate of the value 12 


Extended, open-ended problem.) The equation 
of x(4) accurate to four decimal places. ( п n ) i 


2 of simple harmonic motion 


Find the solution of the differential equation problem d 
ORO 
3 J i dew He +2Vx=0 
с +а(®\ -к=зи, 
df [а dt 








is generally used to model the undamped 
oscillations of a mass supported on the end of 

a linear spring (that is, a spring whose tension is 
strictly proportional to its extension). Most real 
springs are actually nonlinear because as their 
extension or compression increases their 
stiffness changes. This can be modelled by 

the equation 


2 
х(1)= 1, ®ay=1, ay=2 
dt dt 


using the fourth-order Runge-Kutta method. 
Hence find an estimate of the value of x(2.5) 
accurate to four decimal places. 


(Extended, open-ended problem.) The second- 


order, nonlinear, ordinary differential equation dix б 
п ы Тл 
Е Desc зо 
dt For a ‘hard’ spring stiffness increases with 
governs the oscillations of the Van der Pol displacement (8 > 0) and a soft spring’s stiffness 
oscillator. By scaling the time variable the decreases ( B — 0). Investigate the oscillations 
equation can be reduced to of a mass supported by a hard or soft spring. In 
particular determine the connection between 
dx + U(x - pé t (2x)x20 the frequency of the oscillations and their 
2 


dt amplitude. 


\E 
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Figure 3.1 
Elementary 
vector algebra. 


3.1 Introduction


In many applications we use functions of the space variable r = xi + yj + zk as models 
for quantities that vary from point to point in three-dimensional space. There are two types 
of such functions. There are scalar point functions, which model scalar quantities like 
the temperature at a point in a body, and vector point functions, which model vector 
quantities like the velocity of the flow at a point in a liquid. We can express this more 
formally in the following way. For each scalar point function f we have a rule, u = f(r), 
which assigns to each point with coordinate r in the domain of the function a unique
real number u. For vector point functions the rule v = F(r) assigns to each r a unique vector
v in the range of the function. Vector calculus was designed to measure the variation of 
such functions with respect to the space variable r. That development made use of the ideas 
about vectors (components, addition, subtraction, scalar and vector products) described 
in Chapter 4 of Modern Engineering Mathematics and summarized here in Figure 3.1. 


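[Figure 3.1 panels: components; addition a + b; subtraction a − b; scalar product
a · b = |a||b| cos θ; vector product c = a × b with |c| = |a||b| sin θ]

In component form, if a = (a₁, a₂, a₃) and b = (b₁, b₂, b₃) then

\[ \mathbf{a} \pm \mathbf{b} = (a_1 \pm b_1,\ a_2 \pm b_2,\ a_3 \pm b_3) \]

\[ \mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3 = \mathbf{b} \cdot \mathbf{a} \]

\[ \mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = -\mathbf{b} \times \mathbf{a} = (a_2 b_3 - b_2 a_3,\ b_1 a_3 - a_1 b_3,\ a_1 b_2 - b_1 a_2) \]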


Figure 3.2 Level surfaces of f(r) = (2, 2, −1) · r = 2x + 2y − z.

The recent development of computer packages for the modelling of engineering
problems involving vector quantities has relieved designers of much tedious analysis
and computation. To be able to use those packages effectively, however, designers need
a good understanding of the mathematical tools they bring to their tasks. It is on that
basic understanding that this chapter focuses.


3.1.1 Basic concepts


We can picture a scalar point function f(r) by means of its level surfaces f(r) = constant. 
For example, the level surfaces of f(r) = 2x + 2y − z are planes parallel to the plane
z = 2x + 2y, as shown in Figure 3.2. On the level surface the function value does not
change, so the rate of change of the function will be zero along any line drawn on the 
level surface. An alternative name for a scalar point function is scalar field. This is in 
contrast to the vector point function (or vector field). We picture a vector field by its 
field (or flow) lines. A field line is a curve in space represented by the position vector 
r(t) such that at each point of the curve its tangent is parallel to the vector field. Thus 
the field lines of F(r) are given by the differential equation 

\[ \frac{d\mathbf{r}}{dt} = \mathbf{F}(\mathbf{r}), \quad \text{where } \mathbf{r}(t_0) = \mathbf{r}_0 \]

and r₀ is the point on the line corresponding to t = t₀. This vector equation represents
the three simultaneous ordinary differential equations

\[ \frac{dx}{dt} = P(x, y, z), \quad \frac{dy}{dt} = Q(x, y, z), \quad \frac{dz}{dt} = R(x, y, z) \]

where F = (P, Q, R).
Modern computer algebra packages make it easier to draw both the level surfaces of 
scalar functions and the field lines of vector functions, but to underline the basic ideas 
we shall consider two simple examples. 
Example 3.1

Sketch

(a) the level surfaces of the scalar point function f(r) = z e^(−xy);

(b) the field lines of the vector point function F(r) = (−y, x, 1).

Solution

(a) Consider the level surface given by f(r) = c, where c is a number. Then
z e^(−xy) = c and so z = c e^(xy). For c, x and y all positive we can easily sketch part of
the surface as shown in Figure 3.3(a), from which we can deduce the appearance
of the whole family of level surfaces.

Figure 3.3 (a) Level surfaces of f(r) = z e^(−xy); (b) field lines of F(r) = (−y, x, 1).

(b) For the function F(r) = (−y, x, 1) the field lines are given by

\[ \frac{d\mathbf{r}}{dt} = (-y,\ x,\ 1) \]

that is, by the simultaneous differential equations

\[ \frac{dx}{dt} = -y, \quad \frac{dy}{dt} = x, \quad \frac{dz}{dt} = 1 \]

The general solution of these simultaneous equations is

\[ x(t) = A\cos t + B\sin t, \quad y(t) = -B\cos t + A\sin t, \quad z(t) = t + C \]

where A, B and C are arbitrary constants. Considering, in particular, the field line
that passes through (1, 0, 0), we determine the parametric equation

\[ (x(t),\ y(t),\ z(t)) = (\cos t,\ \sin t,\ t) \]

This represents a circular helix as shown in Figure 3.3(b), from which we can
deduce the appearance of the whole family of flow lines.
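
The helix of Example 3.1(b) can be checked numerically by integrating dr/dt = F(r) from the point (1, 0, 0). The following Python sketch is only an illustration: the helper name rk4_field_line, the step count and the end value t = 2 are assumptions chosen for the example.

```python
import math

def field(r):
    """Vector field F(r) = (-y, x, 1) of Example 3.1(b)."""
    x, y, z = r
    return (-y, x, 1.0)

def rk4_field_line(F, r0, t_end, n_steps):
    """Integrate dr/dt = F(r) from r(0) = r0 up to t = t_end using fourth-order Runge-Kutta."""
    h = t_end / n_steps
    r = tuple(r0)
    for _ in range(n_steps):
        k1 = F(r)
        k2 = F(tuple(ri + h / 2 * ki for ri, ki in zip(r, k1)))
        k3 = F(tuple(ri + h / 2 * ki for ri, ki in zip(r, k2)))
        k4 = F(tuple(ri + h * ki for ri, ki in zip(r, k3)))
        r = tuple(ri + h / 6 * (a + 2 * b + 2 * c + d)
                  for ri, a, b, c, d in zip(r, k1, k2, k3, k4))
    return r

end_point = rk4_field_line(field, (1.0, 0.0, 0.0), t_end=2.0, n_steps=200)
exact_point = (math.cos(2.0), math.sin(2.0), 2.0)
print(end_point, exact_point)
```

The computed end point should agree closely with (cos 2, sin 2, 2), the point on the helix at t = 2.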
In MATLAB a level surface may be drawn using the ezsurf function. Using the
Symbolic Math Toolbox the commands:

syms x y z c
for c = [1, 2, 3]
z = c*exp(x*y);
ezsurf(z, [0, 2, 0, 2])
hold on
end


will produce three of the level surfaces z = c e^(xy) on the same set of axes. The
surfaces may also be produced in MAPLE using the plot3d function. The field
lines may be plotted in MATLAB using the streamline function.


To investigate the properties of scalar and vector fields further we need to use the 
calculus of several variables. Here we shall describe the basic ideas and definitions 
needed for vector calculus. A fuller treatment is given in Chapter 9 of Modern Engineer- 
ing Mathematics. 

Given a function f(x) of a single variable x, we measure its rate of change (or 
gradient) by its derivative with respect to x. This is 


\[ \frac{df}{dx} = f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \]


However, a function f(x, y, z) of three independent variables x, y and z does not have a 
unique rate of change. The value of the latter depends on the direction in which it is 
measured. The rate of change of the function f(x, y, z) in the x direction is given by its 
partial derivative with respect to x, namely 


\[ \frac{\partial f}{\partial x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y, z) - f(x, y, z)}{\Delta x} \]


This measures the rate of change of f(x, y, z) with respect to x when y and z are held 
constant. We can calculate such partial derivatives by differentiating f(x, y, z) with 
respect to x, treating y and z as constants. Similarly, 


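\[ \frac{\partial f}{\partial y} = \lim_{\Delta y \to 0} \frac{f(x, y + \Delta y, z) - f(x, y, z)}{\Delta y} \]

and

\[ \frac{\partial f}{\partial z} = \lim_{\Delta z \to 0} \frac{f(x, y, z + \Delta z) - f(x, y, z)}{\Delta z} \]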


define the partial derivatives of f(x, y, z) with respect to y and z respectively. 
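
These limit definitions translate directly into a numerical estimate: vary one coordinate by a small amount while holding the others constant. The Python sketch below is an illustration only; the helper name partial, the sample function x²(y + 2z) and the evaluation point are assumptions, and a symmetric (central) difference quotient is used in place of the one-sided quotient in the definitions.

```python
def partial(f, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect
    to the coordinate numbered `index`, all other coordinates held constant."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)

def f(x, y, z):
    # Sample function chosen for illustration
    return x**2 * (y + 2 * z)

p = (1.5, -0.5, 2.0)
fx = partial(f, p, 0)   # analytic value 2x(y + 2z) = 10.5 at p
fy = partial(f, p, 1)   # analytic value x^2 = 2.25 at p
fz = partial(f, p, 2)   # analytic value 2x^2 = 4.5 at p
print(fx, fy, fz)
```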
For conciseness we sometimes use a suffix notation to denote partial derivatives, for
example writing f_x for ∂f/∂x. The rules for partial differentiation are essentially the
same as for ordinary differentiation, but it must always be remembered which variables
are being held constant.

Higher-order partial derivatives may be defined in a similar manner, with, for
example,

\[ \frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = f_{xx} \]

\[ \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = f_{xy} \]

\[ \frac{\partial^3 f}{\partial z\,\partial y\,\partial x} = \frac{\partial}{\partial z}\left(\frac{\partial^2 f}{\partial y\,\partial x}\right) = f_{xyz} \]


Example 3.2

Find the first partial derivatives of the functions f(x, y, z) with formula (a) x + 2y + z³,
(b) x²(y + 2z) and (c) (x + y)/(z³ + x).


Solution

(a) f(x, y, z) = x + 2y + z³. To obtain f_x, we differentiate f(x, y, z) with respect to x,
keeping y and z constant. Thus f_x = 1, since the derivative of a constant (2y + z³)
with respect to x is zero. Similarly, f_y = 2 and f_z = 3z².


(b) f(x, y, z) = x²(y + 2z). Here we use the same idea: when we differentiate with
respect to one variable, we treat the other two as constants. Thus

\[ \frac{\partial}{\partial x}[x^2(y + 2z)] = (y + 2z)\frac{\partial}{\partial x}(x^2) = 2x(y + 2z) \]

\[ \frac{\partial}{\partial y}[x^2(y + 2z)] = x^2\frac{\partial}{\partial y}(y + 2z) = x^2(1) = x^2 \]

\[ \frac{\partial}{\partial z}[x^2(y + 2z)] = x^2\frac{\partial}{\partial z}(y + 2z) = x^2(2) = 2x^2 \]


(c) f(x, y, z) = (x + y)/(z³ + x). Here we use the same idea, together with basic rules
from ordinary differentiation:

\[ \frac{\partial f}{\partial x} = \frac{(1)(z^3 + x) - (x + y)(1)}{(z^3 + x)^2} = \frac{z^3 - y}{(z^3 + x)^2} \quad \text{(quotient rule)} \]

\[ \frac{\partial f}{\partial y} = \frac{1}{z^3 + x} \]

\[ \frac{\partial f}{\partial z} = \frac{-3z^2(x + y)}{(z^3 + x)^2} \quad \text{(chain rule)} \]

Example 3.3 


Solution 
The partial derivatives f_x and f_y of the function f(x, y), with respect to x and y
respectively, are given by the commands


MATLAB                  MAPLE

syms x y
f = f(x,y);             f := f(x,y);
fx = diff(f,x)          fx := diff(f,x);
fy = diff(f,y)          fy := diff(f,y);


These commands can readily be extended to functions of more than two variables.
Also second-order partial derivatives can be obtained by suitably differentiating the
first-order partial derivatives already found. Thus in MATLAB the second-order
partial derivatives of f(x, y) are given by


fxx = diff(fx,x), fyy = diff(fy,y),
fxy = diff(fx,y), fyx = diff(fy,x)


Alternatively, the non-mixed derivatives can be obtained directly using the
commands


fxx = diff(f,x,2), fyy = diff(f,y,2)


which can be extended to higher-order partial derivatives. The corresponding
commands in MAPLE are


fxx := diff(fx,x); fyy := diff(fy,y); fxy := diff(fx,y); fyx := diff(fy,x);


fxx := diff(f,x,x); fyy := diff(f,y,y);

In Example 3.2 we used the chain (or composite-function) rule of ordinary
differentiation

\[ \frac{df}{dx} = \frac{df}{du}\frac{du}{dx} \]

to obtain the partial derivative ∂f/∂z. The multivariable calculus form of the chain rule
is a little more complicated. If the variables u, v and w are defined in terms of x, y and
z then the partial derivative of f(u, v, w) with respect to x is

\[ \frac{\partial f}{\partial x} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x} + \frac{\partial f}{\partial w}\frac{\partial w}{\partial x} \]

with similar expressions for ∂f/∂y and ∂f/∂z.


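Find ∂T/∂r and ∂T/∂θ when

\[ T(x, y) = x^3 - xy + y^3 \]

and

\[ x = r\cos\theta \quad \text{and} \quad y = r\sin\theta \]

By the chain rule,

\[ \frac{\partial T}{\partial r} = \frac{\partial T}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial T}{\partial y}\frac{\partial y}{\partial r} \]

In this example

\[ \frac{\partial T}{\partial x} = 3x^2 - y \quad \text{and} \quad \frac{\partial T}{\partial y} = -x + 3y^2 \]

and

\[ \frac{\partial x}{\partial r} = \cos\theta \quad \text{and} \quad \frac{\partial y}{\partial r} = \sin\theta \]

so that

\[ \frac{\partial T}{\partial r} = (3x^2 - y)\cos\theta + (-x + 3y^2)\sin\theta \]

Substituting for x and y in terms of r and θ gives

\[ \frac{\partial T}{\partial r} = 3r^2(\cos^3\theta + \sin^3\theta) - 2r\cos\theta\sin\theta \]

Similarly,

\[ \frac{\partial T}{\partial \theta} = (3x^2 - y)(-r\sin\theta) + (-x + 3y^2)\,r\cos\theta \]

\[ = 3r^3(\sin\theta - \cos\theta)\cos\theta\sin\theta + r^2(\sin^2\theta - \cos^2\theta) \]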

Example 3.4 Find dH/dt when

\[ H(t) = \sin(3x - y) \]

and

\[ x = 2t^2 - 3 \quad \text{and} \quad y = \tfrac{1}{2}t^2 - 5t + 1 \]


Solution We note that x and y are functions of t only, so that the chain rule becomes

\[ \frac{dH}{dt} = \frac{\partial H}{\partial x}\frac{dx}{dt} + \frac{\partial H}{\partial y}\frac{dy}{dt} \]

Note the mixture of partial and ordinary derivatives. H is a function of the one variable
t, but its dependence is expressed through the two variables x and y.
Substituting for the derivatives involved, we have

\[ \frac{dH}{dt} = 3[\cos(3x - y)]\,4t - [\cos(3x - y)](t - 5) \]

\[ = (11t + 5)\cos(3x - y) \]

\[ = (11t + 5)\cos\left(\tfrac{11}{2}t^2 + 5t - 10\right) \]

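Example 3.5 A scalar point function f(r) can be expressed in terms of rectangular cartesian
coordinates (x, y, z) or in terms of spherical polar coordinates (r, θ, φ), where

\[ x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta \]

as shown in Figure 3.4. Find ∂f/∂x in terms of the partial derivatives of the function
with respect to r, θ and φ.

Figure 3.4 Spherical polar coordinates.

Solution Using the chain rule, we have

\[ \frac{\partial f}{\partial x} = \frac{\partial f}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial f}{\partial \theta}\frac{\partial \theta}{\partial x} + \frac{\partial f}{\partial \phi}\frac{\partial \phi}{\partial x} \]

From Figure 3.4, r² = x² + y² + z², tan φ = y/x and tan θ = (x² + y²)^(1/2)/z, so that

\[ \frac{\partial r}{\partial x} = \frac{x}{r} = \sin\theta\cos\phi \]

\[ \frac{\partial \phi}{\partial x} = \frac{\partial}{\partial x}\left(\tan^{-1}\frac{y}{x}\right) = -\frac{y}{x^2 + y^2} = -\frac{\sin\phi}{r\sin\theta} \]

\[ \frac{\partial \theta}{\partial x} = \frac{\partial}{\partial x}\left(\tan^{-1}\frac{(x^2 + y^2)^{1/2}}{z}\right) = \frac{xz}{(x^2 + y^2 + z^2)(x^2 + y^2)^{1/2}} = \frac{\cos\phi\cos\theta}{r} \]

Thus

\[ \frac{\partial f}{\partial x} = \sin\theta\cos\phi\,\frac{\partial f}{\partial r} - \frac{\sin\phi}{r\sin\theta}\frac{\partial f}{\partial \phi} + \frac{\cos\phi\cos\theta}{r}\frac{\partial f}{\partial \theta} \]


Example 3.6 The Laplace equation in two dimensions is

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \]

where x and y are rectangular cartesian coordinates. Show that expressed in polar
coordinates (r, θ), where x = r cos θ and y = r sin θ, the Laplace equation may be written

\[ \frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2} = 0 \]

Solution Using the chain rule, we have

\[ \frac{\partial u}{\partial r} = \frac{\partial u}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial u}{\partial y}\frac{\partial y}{\partial r} = \frac{\partial u}{\partial x}\cos\theta + \frac{\partial u}{\partial y}\sin\theta \]

and

\[ \frac{\partial^2 u}{\partial r^2} = \frac{\partial^2 u}{\partial x^2}\cos^2\theta + \frac{\partial^2 u}{\partial y^2}\sin^2\theta + 2\frac{\partial^2 u}{\partial x\,\partial y}\sin\theta\cos\theta \]

Similarly

\[ \frac{\partial u}{\partial \theta} = \frac{\partial u}{\partial x}(-r\sin\theta) + \frac{\partial u}{\partial y}(r\cos\theta) \]

and

\[ \frac{\partial^2 u}{\partial \theta^2} = \frac{\partial^2 u}{\partial x^2}(-r\sin\theta)^2 + \frac{\partial^2 u}{\partial y^2}(r\cos\theta)^2 - 2\frac{\partial^2 u}{\partial x\,\partial y}\,r^2\sin\theta\cos\theta + \frac{\partial u}{\partial x}(-r\cos\theta) + \frac{\partial u}{\partial y}(-r\sin\theta) \]

so that

\[ \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2} = \frac{\partial^2 u}{\partial x^2}\sin^2\theta + \frac{\partial^2 u}{\partial y^2}\cos^2\theta - 2\frac{\partial^2 u}{\partial x\,\partial y}\sin\theta\cos\theta - \frac{1}{r}\left(\frac{\partial u}{\partial x}\cos\theta + \frac{\partial u}{\partial y}\sin\theta\right) \]

Hence

\[ \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2} + \frac{1}{r}\frac{\partial u}{\partial r} = \frac{\partial^2 u}{\partial x^2}\sin^2\theta + \frac{\partial^2 u}{\partial y^2}\cos^2\theta - 2\frac{\partial^2 u}{\partial x\,\partial y}\sin\theta\cos\theta \]

and

\[ \frac{\partial^2 u}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2} + \frac{1}{r}\frac{\partial u}{\partial r} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \]

Since

\[ \frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right) = r\frac{\partial^2 u}{\partial r^2} + \frac{\partial u}{\partial r} \]

we obtain the polar form of the Laplace equation in two dimensions

\[ \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2} = 0 \]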
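
The polar form of the Laplace equation can be spot-checked with finite differences. In the Python sketch below, which is an illustration rather than part of the derivation, the test functions r² cos 2θ (that is, x² − y², which satisfies the Laplace equation) and r² (that is, x² + y², whose Laplacian is 4), the sample point and the step h are all assumptions.

```python
import math

def polar_laplacian(u, r, theta, h=1e-3):
    """Central-difference estimate of u_rr + (1/r) u_r + (1/r^2) u_theta_theta."""
    u_rr = (u(r + h, theta) - 2 * u(r, theta) + u(r - h, theta)) / h**2
    u_r = (u(r + h, theta) - u(r - h, theta)) / (2 * h)
    u_tt = (u(r, theta + h) - 2 * u(r, theta) + u(r, theta - h)) / h**2
    return u_rr + u_r / r + u_tt / r**2

def harmonic(r, theta):
    return r**2 * math.cos(2 * theta)   # x^2 - y^2 in polar form

def not_harmonic(r, theta):
    return r**2                         # x^2 + y^2 in polar form

lap_harmonic = polar_laplacian(harmonic, 1.2, 0.4)
lap_other = polar_laplacian(not_harmonic, 1.2, 0.4)
print(lap_harmonic, lap_other)
```

The first value should be close to zero and the second close to 4, in agreement with the cartesian form of the Laplacian.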


The chain rule can be readily handled in both MATLAB and MAPLE. Considering
Example 3.3, in MATLAB the solution may be developed as follows:
The commands


syms x y r theta
T = x^3 - x*y + y^3;
Tx = diff(T,x); Ty = diff(T,y);
x = r*cos(theta); y = r*sin(theta);
xr = diff(x,r); xtheta = diff(x,theta); yr = diff(y,r);
ytheta = diff(y,theta);
Tr = Tx*xr + Ty*yr


return


Tr = (3*x^2 - y)*cos(theta) + (-x + 3*y^2)*sin(theta)


To substitute for x and y in terms of r and theta we make use of the eval
function, with


eval(Tr); pretty(ans)

returning the answer


(3r²cos(theta)² - r sin(theta))cos(theta) +
(-r cos(theta) + 3r²sin(theta)²)sin(theta)


which readily reduces to the answer given in the solution.
Similarly the commands


Ttheta = Tx*xtheta + Ty*ytheta;
eval(Ttheta); pretty(ans)


return the answer

(-3r²cos(theta)² + r sin(theta))r sin(theta) +
(-r cos(theta) + 3r²sin(theta)²)r cos(theta)

which also reduces to the answer given in the solution.
MAPLE solves this problem much more efficiently using the commands


T := (x,y) -> x^3 - x*y + y^3;
diff(T(r*cos(theta), r*sin(theta)), r);
diff(T(r*cos(theta), r*sin(theta)), theta);
collect(%,r);

returning the answer


(-3cos(θ)²sin(θ) + 3sin(θ)²cos(θ))r³
+ (sin(θ)² - cos(θ)²)r²


3.1.2 Exercises 


Check your answers using MATLAB or MAPLE whenever possible. 


Sketch the contours (in two dimensions) of the 5 Find all the first and second partial derivatives of 
scalar functions the functions 

(а) fé, y) 2 InG? y? - 1) (a) fü) 2xvz-x!*y-z — (b) fi = xyz 
(b) f(x, y) 2 tan ![ y/(1 4 x)] (c) für) 2 z tan! (y/x) 

Sketch the flow lines (in two dimensions) of the 6 Find dffdt, where 


vector functions 


(a) FG, y) 2 yi * (6x* - 4j 


(a) für) 9x? - y? -z andx 2 P — 1, у= 21, 
2= 102-1) 


(b) F(x, y)=yit (GX? – х)ј 
where i and j are unit vectors in the direction of 
the x and y axes respectively. 7 Find 0f/ày and əf/ðz in terms of the partial 


Sketch the level surfaces of the functions 


(a) fr) =z- xy 


(b) f(r) = xyz, and x = e”sin £, y = e”cos t, z = t 


derivatives of f with respect to spherical polar 
coordinates (r, 0, 6) (see Example 3.5). 
(0) Хю) =2-е” 


8 Show that if u(r) = f(r), where r?° = x? + y? + 2°, as 


Sketch the field lines of the functions usual, and 


(a) F(r) = @y, ¥ +1,2) 


2 2 2 


(b) F(r) 2 (yz, zx, xy) dx ду az 
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then satisfies the differential equation 
Pf 28f o 2v ду _ду 
5 = 
dr ax ду д 


Hence find the general form for f(r). 


Show that 


10 Verify that V(x, y, z) = sin3x cos4y cosh5z satisfies 
the differential equation 


242 OV ƏV ƏV 
Vix, y, 2) = leg [E ) о а? 


ox ду дг 


3.1.3 Transformations 


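Example 3.3 may be viewed as an example of transformation of coordinates. For
example, consider the transformation or mapping from the (x, y) plane to the (s, t)
plane defined by

\[ s = s(x, y), \quad t = t(x, y) \tag{3.1} \]

Then a function u = f(x, y) of x and y becomes a function u = F(s, t) of s and t under the
transformation, and the partial derivatives are related by

\[ \frac{\partial u}{\partial x} = \frac{\partial u}{\partial s}\frac{\partial s}{\partial x} + \frac{\partial u}{\partial t}\frac{\partial t}{\partial x}, \qquad \frac{\partial u}{\partial y} = \frac{\partial u}{\partial s}\frac{\partial s}{\partial y} + \frac{\partial u}{\partial t}\frac{\partial t}{\partial y} \tag{3.2} \]

In matrix notation this becomes

\[ \begin{bmatrix} \dfrac{\partial u}{\partial x} \\[2mm] \dfrac{\partial u}{\partial y} \end{bmatrix} = \begin{bmatrix} \dfrac{\partial s}{\partial x} & \dfrac{\partial t}{\partial x} \\[2mm] \dfrac{\partial s}{\partial y} & \dfrac{\partial t}{\partial y} \end{bmatrix} \begin{bmatrix} \dfrac{\partial u}{\partial s} \\[2mm] \dfrac{\partial u}{\partial t} \end{bmatrix} \tag{3.3} \]

The determinant of the matrix of the transformation is called the Jacobian of the
transformation defined by (3.1) and is abbreviated to

\[ \frac{\partial(s, t)}{\partial(x, y)} \quad \text{or simply to } J \]

so that

\[ J = \frac{\partial(s, t)}{\partial(x, y)} = \begin{vmatrix} \dfrac{\partial s}{\partial x} & \dfrac{\partial t}{\partial x} \\[2mm] \dfrac{\partial s}{\partial y} & \dfrac{\partial t}{\partial y} \end{vmatrix} = \begin{vmatrix} \dfrac{\partial s}{\partial x} & \dfrac{\partial s}{\partial y} \\[2mm] \dfrac{\partial t}{\partial x} & \dfrac{\partial t}{\partial y} \end{vmatrix} \tag{3.4} \]

The matrix itself is referred to as the Jacobian matrix and is generally expressed in
the form

\[ \begin{bmatrix} \dfrac{\partial s}{\partial x} & \dfrac{\partial s}{\partial y} \\[2mm] \dfrac{\partial t}{\partial x} & \dfrac{\partial t}{\partial y} \end{bmatrix} \]

The Jacobian plays an important role in various applications of
mathematics in engineering, particularly in implementing changes in variables in
multiple integrals, as considered later in this chapter.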

As indicated earlier, (3.1) define a transformation of the (x, y) plane to the (s, t) plane
and give the coordinates of a point in the (s, t) plane corresponding to a point in the
(x, y) plane. If we solve (3.1) for x and y, we obtain

\[ x = X(s, t), \quad y = Y(s, t) \tag{3.5} \]

which represent a transformation of the (s, t) plane into the (x, y) plane. This is called
the inverse transformation of the transformation defined by (3.1), and, analogously to
(3.2), we can relate the partial derivatives by

\[ \frac{\partial u}{\partial s} = \frac{\partial u}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial u}{\partial y}\frac{\partial y}{\partial s}, \qquad \frac{\partial u}{\partial t} = \frac{\partial u}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial u}{\partial y}\frac{\partial y}{\partial t} \tag{3.6} \]

The Jacobian of the inverse transformation (3.5) is

\[ J_1 = \frac{\partial(x, y)}{\partial(s, t)} = \begin{vmatrix} x_s & y_s \\ x_t & y_t \end{vmatrix} \]

where the suffix notation has been used to denote the partial derivatives. Provided
J ≠ 0, it is always true that J₁ = J⁻¹ or

\[ \frac{\partial(x, y)}{\partial(s, t)}\,\frac{\partial(s, t)}{\partial(x, y)} = 1 \]

If J = 0 then the variables s and t defined by (3.1) are functionally dependent; that is, a
relationship of the form f(s, t) = 0 exists. This implies a non-unique correspondence
between points in the (x, y) and (s, t) planes.


If s = s(x, y), t = t(x, y) then using MuPAD in MATLAB the commands


delete x, y:
linalg::jacobian([s, t], [x, y])


return the Jacobian matrix

\[ \begin{bmatrix} \dfrac{\partial s}{\partial x} & \dfrac{\partial s}{\partial y} \\[2mm] \dfrac{\partial t}{\partial x} & \dfrac{\partial t}{\partial y} \end{bmatrix} \]

The same result may be obtained with the Symbolic Math Toolbox using the
commands


syms x y s t
jacobian([s; t], [x y])


or in MAPLE using the commands


with(VectorCalculus):
Jacobian([s, t], [x, y]);
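Example 3.7

(a) Obtain the Jacobian J of the transformation

\[ s = 2x + y, \quad t = x - 2y \]

(b) Determine the inverse transformation of the above transformation and obtain its
Jacobian J₁. Confirm that J₁ = J⁻¹.


Solution

(a) Using (3.4), the Jacobian of the transformation is

\[ J = \frac{\partial(s, t)}{\partial(x, y)} = \begin{vmatrix} 2 & 1 \\ 1 & -2 \end{vmatrix} = -5 \]

(b) Solving the pair of equations in the transformation for x and y gives the inverse
transformation as

\[ x = \tfrac{1}{5}(2s + t), \quad y = \tfrac{1}{5}(s - 2t) \]

The Jacobian of this inverse transformation is

\[ J_1 = \frac{\partial(x, y)}{\partial(s, t)} = \begin{vmatrix} \tfrac{2}{5} & \tfrac{1}{5} \\[1mm] \tfrac{1}{5} & -\tfrac{2}{5} \end{vmatrix} = -\tfrac{1}{5} \]

confirming that J₁ = J⁻¹.


Example 3.8

Show that the variables x and y given by

\[ x = \frac{s + t}{s}, \quad y = \frac{s + t}{t} \tag{3.7} \]

are functionally dependent, and obtain the relationship f(x, y) = 0.


Solution

The Jacobian of the transformation (3.7) is

\[ J = \frac{\partial(x, y)}{\partial(s, t)} = \begin{vmatrix} x_s & y_s \\ x_t & y_t \end{vmatrix} = \begin{vmatrix} -\dfrac{t}{s^2} & \dfrac{1}{t} \\[2mm] \dfrac{1}{s} & -\dfrac{s}{t^2} \end{vmatrix} = \frac{1}{st} - \frac{1}{st} = 0 \]

Since J = 0, the variables x and y are functionally related.
Rearranging (3.7), we have

\[ x = 1 + \frac{t}{s}, \quad y = \frac{s}{t} + 1 \]

so that

\[ (x - 1)(y - 1) = \frac{t}{s}\,\frac{s}{t} = 1 \]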


11 


12 


13 


14 
giving the functional relationship as

\[ xy - (x + y) = 0 \]
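
The Jacobians of Examples 3.7 and 3.8 can be confirmed numerically by estimating the four partial derivatives with difference quotients. The Python sketch below is illustrative only; the helper name jacobian_det and the sample points are assumptions, and the determinant is formed in the row order of the second determinant in (3.4).

```python
def jacobian_det(f, g, a, b, h=1e-6):
    """Central-difference estimate of the Jacobian d(f, g)/d(a, b)."""
    fa = (f(a + h, b) - f(a - h, b)) / (2 * h)
    fb = (f(a, b + h) - f(a, b - h)) / (2 * h)
    ga = (g(a + h, b) - g(a - h, b)) / (2 * h)
    gb = (g(a, b + h) - g(a, b - h)) / (2 * h)
    return fa * gb - fb * ga

# Example 3.7: s = 2x + y, t = x - 2y and its inverse transformation
J = jacobian_det(lambda x, y: 2 * x + y, lambda x, y: x - 2 * y, 0.3, -1.1)
J1 = jacobian_det(lambda s, t: (2 * s + t) / 5, lambda s, t: (s - 2 * t) / 5, 0.3, -1.1)

# Example 3.8: x = (s + t)/s, y = (s + t)/t are functionally dependent
J38 = jacobian_det(lambda s, t: (s + t) / s, lambda s, t: (s + t) / t, 1.5, 2.5)

print(J, J1, J * J1, J38)
```

The product J·J₁ should be close to 1, and the Jacobian for Example 3.8 should be close to zero at any point with s and t non-zero.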


The definition of a Jacobian is not restricted to functions of two variables, and it is
readily extendable to functions of many variables. For example, for functions of three
variables, if

\[ u = U(x, y, z), \quad v = V(x, y, z), \quad w = W(x, y, z) \tag{3.8} \]

represents a transformation in three dimensions from the variables x, y, z to the variables
u, v, w then the corresponding Jacobian is

\[ J = \frac{\partial(u, v, w)}{\partial(x, y, z)} = \begin{vmatrix} u_x & v_x & w_x \\ u_y & v_y & w_y \\ u_z & v_z & w_z \end{vmatrix} = \begin{vmatrix} u_x & u_y & u_z \\ v_x & v_y & v_z \\ w_x & w_y & w_z \end{vmatrix} \]

Again, if J = 0, it follows that there exists a functional relationship f(u, v, w) = 0 between
the variables u, v and w defined by (3.8).


3.1.4 Exercises 


Show that if x + y = u and y = uv, then 15 Find the value of the constant K for which 
Bes u = Кх + 4у +2? 
Xu, v) — v=3x+2y+z 
w = 2yz + 3zx + 6xy 
Show that ia Pepe ree eae a are functionally related, and obtain the 
then i : 
corresponding relation. 
Әб, у, 2) _ 2, 16 Show that, if u = g(x, y) and v = h(x, y), then 
Ou, v, w) 


vl X% 


If x =e" cosv and y = e” sinv, obtain the two ди ду дь ду 
Jacobians dy .. Ov jy dy _ ди p 
ди ox dv ox 
o(x, y Hu, v) where in each case 
Au, v) a(x, y) 
ae o(u, v) 
and verify that they are mutual inverses. a(x, y) 


Find the values of the constant parameter A for 
which the functions 


u = cosx cosy — Asinxsin y 
v = sinx cos y + Acos xsin y 


are functionally dependent. 


Use the results of Exercise 16 to obtain the partial 
derivatives 


du д> ди дь 
where 


u-e'cosy and v-e*siny 
3.1.5 The total differential


Consider a function u = f(x, y) of two variables x and y. Let Δx and Δy be increments
in the values of x and y. Then the corresponding increment in u is given by

\[ \Delta u = f(x + \Delta x, y + \Delta y) - f(x, y) \]

We rewrite this as two terms: one showing the change in u due to the change in x, and
the other showing the change in u due to the change in y. Thus

\[ \Delta u = [f(x + \Delta x, y + \Delta y) - f(x, y + \Delta y)] + [f(x, y + \Delta y) - f(x, y)] \]

Dividing the first bracketed term by Δx and the second by Δy gives

\[ \Delta u = \frac{f(x + \Delta x, y + \Delta y) - f(x, y + \Delta y)}{\Delta x}\,\Delta x + \frac{f(x, y + \Delta y) - f(x, y)}{\Delta y}\,\Delta y \]

From the definition of the partial derivative, we may approximate this expression by

\[ \Delta u \approx \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y \]

We define the differential du by the equation

\[ du = \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y \tag{3.9} \]

By setting f(x, y) = f₁(x, y) = x and f(x, y) = f₂(x, y) = y in turn in (3.9), we see that

\[ dx = \frac{\partial f_1}{\partial x}\,\Delta x + \frac{\partial f_1}{\partial y}\,\Delta y = \Delta x \quad \text{and} \quad dy = \Delta y \]

so that for the independent variables increments and differentials are equal. For the
dependent variable we have

\[ du = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy \tag{3.10} \]

We see that the differential du is an approximation to the change Δu in u = f(x, y)
resulting from small changes Δx and Δy in the independent variables x and y; that is,

\[ \Delta u \approx du = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y \tag{3.11} \]

a result illustrated in Figure 3.5.

Figure 3.5 Illustration of result (3.11).

This extends to functions of as many variables as we please, provided that the partial
derivatives exist. For example, for a function of three variables (x, y, z) defined by
u = f(x, y, z) we have

\[ \Delta u \approx du = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz = \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y + \frac{\partial f}{\partial z}\,\Delta z \]

The differential of a function of several variables is often called a total differential,
emphasizing that it shows the variation of the function with respect to small changes in
all the independent variables.


Example 3.9

Find the total differential of u(x, y) = x^y.


Solution

Taking partial derivatives we have

\[ \frac{\partial u}{\partial x} = y x^{y-1} \quad \text{and} \quad \frac{\partial u}{\partial y} = x^y \ln x \]

Hence, using (3.10),

\[ du = y x^{y-1}\,dx + x^y \ln x\,dy \]
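
The sense in which du approximates Δu can be seen numerically for the function of Example 3.9. The Python sketch below is an illustration only; the base point (2, 3), the equal increments in x and y, and the helper name gap are assumptions.

```python
import math

def u(x, y):
    return x**y

def du(x, y, dx, dy):
    """Total differential of u = x**y from Example 3.9."""
    return y * x**(y - 1) * dx + x**y * math.log(x) * dy

def gap(step, x0=2.0, y0=3.0):
    """|delta u - du| when both x and y are increased by `step`."""
    delta_u = u(x0 + step, y0 + step) - u(x0, y0)
    return abs(delta_u - du(x0, y0, step, step))

for step in (1e-1, 1e-2, 1e-3):
    print(step, gap(step))
```

The gap shrinks roughly in proportion to the square of the increment, which is the behaviour expected of a first-order approximation.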


Differentials sometimes arise naturally when modelling practical problems. When this
occurs, it is often possible to analyse the problem further by testing to see if the
expression in which the differentials occur is a total differential. Consider the equation

\[ P(x, y)\,dx + Q(x, y)\,dy = 0 \]

connecting x, y and their differentials. The left-hand side of this equation is said to be
an exact differential if there is a function f(x, y) such that

\[ df = P(x, y)\,dx + Q(x, y)\,dy \]

Now we know that

\[ df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy \]

so if f(x, y) exists then

\[ P(x, y) = \frac{\partial f}{\partial x} \quad \text{and} \quad Q(x, y) = \frac{\partial f}{\partial y} \]

For functions with continuous second derivatives we have

\[ \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} \]

Thus if f(x, y) exists then

\[ \frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \tag{3.12} \]

This gives us a test for the existence of f(x, y), but does not tell us how to find
it! The technique for finding f(x, y) is shown in Example 3.10.


Example 3.10

Show that

\[ (6x + 9y + 11)\,dx + (9x - 4y + 3)\,dy \]

is an exact differential and find the relationship between y and x given

\[ \frac{dy}{dx} = -\frac{6x + 9y + 11}{9x - 4y + 3} \]

and the condition y = 1 when x = 0.


Solution

In this example

\[ P(x, y) = 6x + 9y + 11 \quad \text{and} \quad Q(x, y) = 9x - 4y + 3 \]

First we test whether the expression is an exact differential. In this example

\[ \frac{\partial P}{\partial y} = 9 \quad \text{and} \quad \frac{\partial Q}{\partial x} = 9 \]

so from (3.12), we have an exact differential. Thus we know that there is a function
f(x, y) such that

\[ \frac{\partial f}{\partial x} = 6x + 9y + 11 \quad \text{and} \quad \frac{\partial f}{\partial y} = 9x - 4y + 3 \tag{3.13a, b} \]

Integrating (3.13a) with respect to x, keeping y constant (that is, reversing the partial
differentiation process), we have

\[ f(x, y) = 3x^2 + 9xy + 11x + g(y) \tag{3.14} \]

Note that the ‘constant’ of integration is a function of y. You can check that this expression
for f(x, y) is correct by differentiating it partially with respect to x. But we also know
from (3.13b) the partial derivative of f(x, y) with respect to y, and this enables us to find
g′(y). Differentiating (3.14) partially with respect to y and equating it to (3.13b), we have

\[ \frac{\partial f}{\partial y} = 9x + \frac{dg}{dy} = 9x - 4y + 3 \]

(Note that since g is a function of y only we use dg/dy rather than ∂g/∂y.) Thus

\[ \frac{dg}{dy} = -4y + 3 \]

so, on integrating,

\[ g(y) = -2y^2 + 3y + C \]

Substituting back into (3.14) gives

\[ f(x, y) = 3x^2 + 9xy + 11x - 2y^2 + 3y + C \]

Now we are given that

\[ \frac{dy}{dx} = -\frac{6x + 9y + 11}{9x - 4y + 3} \]

which implies that

\[ (6x + 9y + 11)\,dx + (9x - 4y + 3)\,dy = 0 \]

which in turn implies that

\[ 3x^2 + 9xy + 11x - 2y^2 + 3y + C = 0 \]

The arbitrary constant C is fixed by applying the given condition y = 1 when x = 0,
giving C = −1. Thus x and y satisfy the equation

\[ 3x^2 + 9xy + 11x - 2y^2 + 3y = 1 \]
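
The steps of Example 3.10 can be checked numerically: the exactness test, the two conditions (3.13a, b) on f, and the value of f at the given condition. The Python sketch below is illustrative; the helper names ddx and ddy, the sample point and the step h are assumptions, and f is taken with C omitted so that the solution curve is f(x, y) = 1.

```python
def P(x, y):
    return 6 * x + 9 * y + 11

def Q(x, y):
    return 9 * x - 4 * y + 3

def f(x, y):
    # f(x, y) from Example 3.10 with the constant C omitted
    return 3 * x**2 + 9 * x * y + 11 * x - 2 * y**2 + 3 * y

def ddx(g, x, y, h=1e-5):
    """Central-difference estimate of the partial derivative of g with respect to x."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def ddy(g, x, y, h=1e-5):
    """Central-difference estimate of the partial derivative of g with respect to y."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 0.8, -0.6
exactness_gap = ddy(P, x0, y0) - ddx(Q, x0, y0)   # test (3.12): should be about 0
fx_gap = ddx(f, x0, y0) - P(x0, y0)               # (3.13a): should be about 0
fy_gap = ddy(f, x0, y0) - Q(x0, y0)               # (3.13b): should be about 0
at_condition = f(0.0, 1.0)                        # value of f where x = 0, y = 1
print(exactness_gap, fx_gap, fy_gap, at_condition)
```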


3.1.6 Exercises


18 Determine which of the following are exact is the exact differential of a function f(x, y). Find the 
differentials of a function, and find, where corresponding function f(x, y) that also satisfies the 
appropriate, the corresponding function. condition f(0, 1) = 0. 

(а) (у + 2xy + D) dx € xy x?) dy 20 Show that the differential 
(b) (2xy? + 3y cos 3x) dx + (2x7y + sin 3x) dy 8(х, у) = (10x? + 6xy + 6’) dx 


+ (9x? + 4xy + 15у2) ду 


(с) (6xy — y*) dx + (2x e” — x”) dy 


is not exact, but that a constant m can be chosen so 


(d) (Z° —3y) dx + (12y? — 3x) dy + 3xz?dz that 


(2х + Зу)"е(х, у) 


19 Find the value of the constant A such that 


is equal to dz, the exact differential of a function 


(усоѕх + Acosy) dx + (xsiny + sinx + y) dy z = f(x, y). Find fx, y). 


3.2 Derivatives of a scalar point function

In many practical problems it is necessary to measure the rate of change of a scalar
point function. For example, in heat transfer problems we need to know the rate of
change of temperature from point to point, because that determines the rate at which
heat flows. Similarly, if we are investigating the electric field due to static charges,
we need to know the variation of the electric potential from point to point. To determine
such information, the ideas of calculus were extended to vector quantities. The
first development of this was the concept of the gradient of a scalar point function.

3.2.1 The gradient of a scalar point function


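We described in Section 3.1.1 how the gradient of a scalar field depended on the direction
along which its rate of change was measured. We now explore this idea further.
Consider the rate of change of the function f(r) at the point (x, y, z) in the direction of
the unit vector (l, m, n). To find this, we need to evaluate the limit

$$\lim_{\Delta r \to 0} \frac{f(\mathbf{r} + \Delta\mathbf{r}) - f(\mathbf{r})}{\Delta r}$$

where Δr is in the direction of (l, m, n). In terms of coordinates, this means

$$\mathbf{r} + \Delta\mathbf{r} = \mathbf{r} + \Delta r\,(l, m, n) = (x + \Delta x,\; y + \Delta y,\; z + \Delta z)$$

so that

$$\Delta x = l\,\Delta r, \qquad \Delta y = m\,\Delta r, \qquad \Delta z = n\,\Delta r$$

Thus we have to consider the limit

$$\lim_{\Delta r \to 0} \frac{f(x + l\Delta r,\; y + m\Delta r,\; z + n\Delta r) - f(x, y, z)}{\Delta r}$$

We can rewrite this as

$$\lim_{\Delta r \to 0} \left[\frac{f(x + l\Delta r,\; y + m\Delta r,\; z + n\Delta r) - f(x,\; y + m\Delta r,\; z + n\Delta r)}{l\,\Delta r}\right] l$$

$$+ \lim_{\Delta r \to 0} \left[\frac{f(x,\; y + m\Delta r,\; z + n\Delta r) - f(x,\; y,\; z + n\Delta r)}{m\,\Delta r}\right] m$$

$$+ \lim_{\Delta r \to 0} \left[\frac{f(x,\; y,\; z + n\Delta r) - f(x, y, z)}{n\,\Delta r}\right] n$$

Evaluating the limits, remembering that Δx = lΔr and so on, we find that the rate of
change of f(r) in the direction of the unit vector (l, m, n) is

$$\frac{\partial f}{\partial x}\,l + \frac{\partial f}{\partial y}\,m + \frac{\partial f}{\partial z}\,n = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) \cdot (l, m, n)$$

The vector

$$\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right)$$

is called the gradient of the scalar point function f(x, y, z), and is denoted by grad f or
by ∇f, where ∇ is the vector operator

$$\nabla = \mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}$$

where i, j and k are the usual triad of unit vectors.
The symbol ∇ is called 'del' or sometimes 'nabla'. Then

$$\nabla f \equiv \operatorname{grad} f \equiv \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k} \equiv \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) \tag{3.15}$$

Thus we can calculate the rate of change of f(x, y, z) along any direction we please. If
û is the unit vector in that direction then

$$(\operatorname{grad} f) \cdot \hat{\mathbf{u}}$$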
gives the required directional derivative, that is the rate of change of f(x, y, z) in the
direction of û. Remembering that a · b = |a||b| cos θ, where θ is the angle between the
two vectors, it follows that the rate of change of f(x, y, z) is zero along directions
perpendicular to grad f and is maximum along the direction parallel to grad f. Furthermore,
grad f acts along the normal direction to the level surface of f(x, y, z). We can see this
by considering the level surfaces of the function corresponding to c and c + Δc, as
shown in Figure 3.6(a). In going from P on the surface f(r) = c to any point Q on
f(r) = c + Δc, the increase in f is the same whatever point Q is chosen, but the distance
PQ will be smallest, and hence the rate of change of f(x, y, z) greatest, when Q lies on the
normal n̂ to the surface at P. Thus grad f at P is in the direction of the outward normal
n̂ to the surface f(r) = c, and represents in magnitude and direction the greatest rate of
increase of f(x, y, z) with distance (Figure 3.6(b)). It is frequently written as

$$\operatorname{grad} f = \frac{\partial f}{\partial n}\,\hat{\mathbf{n}}$$

where ∂f/∂n is referred to as the normal derivative to the surface f(r) = c.

Figure 3.6 (a) Adjacent level surfaces of f(r); (b) grad f acts normally to the surface f(r) = c.

Example 3.11

Find grad f for f(r) = 3x² + 2y² + z² at the point (1, 2, 3). Hence calculate

(a) the directional derivative of f(r) at (1, 2, 3) in the direction of the unit vector
(1/3)(2, 2, 1);

(b) the maximum rate of change of the function at (1, 2, 3) and its direction.

Solution

(a) Since ∂f/∂x = 6x, ∂f/∂y = 4y and ∂f/∂z = 2z, we have from (3.15) that

grad f = ∇f = 6xi + 4yj + 2zk

At the point (1, 2, 3)

grad f = 6i + 8j + 6k

Thus the directional derivative of f(r) at (1, 2, 3) in the direction of the unit vector
(2/3, 2/3, 1/3) is

$$(6\mathbf{i} + 8\mathbf{j} + 6\mathbf{k}) \cdot \left(\tfrac{2}{3}\mathbf{i} + \tfrac{2}{3}\mathbf{j} + \tfrac{1}{3}\mathbf{k}\right) = \tfrac{34}{3}$$

(b) The maximum rate of change of f(r) at (1, 2, 3) occurs along the direction parallel
to grad f at (1, 2, 3); that is, parallel to (6, 8, 6). The unit vector in that direction
is (3, 4, 3)/√34 and the maximum rate of change of f(r) is |grad f| = 2√34.

If a surface in three dimensions is specified by the equation f(x, y, z) = c, or equivalently
f(r) = c, then grad f is a vector perpendicular to that surface. This enables us to
calculate the normal vector at any point on the surface, and consequently to find the
equation of the tangent plane at that point.

Example 3.12

A paraboloid of revolution has equation 2z = x² + y². Find the unit normal vector to the
surface at the point (1, 3, 5). Hence obtain the equation of the normal and the tangent
plane to the surface at that point.

Solution

A vector normal to the surface 2z = x² + y² is given by

grad (x² + y² − 2z) = 2xi + 2yj − 2k

At the point (1, 3, 5) the vector has the value 2i + 6j − 2k. Thus the normal unit vector
at the point (1, 3, 5) is (i + 3j − k)/√11. The equation of the line through (1, 3, 5) in the
direction of this normal is

$$\frac{x - 1}{1} = \frac{y - 3}{3} = \frac{z - 5}{-1}$$

and the equation of the tangent plane is

(1)(x − 1) + (3)(y − 3) + (−1)(z − 5) = 0

which simplifies to x + 3y − z = 5 (see Figure 3.7).

Figure 3.7 Tangent plane at (1, 3, 5) to the paraboloid 2z = x² + y².
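The gradient calculations of Examples 3.11 and 3.12 can be reproduced numerically. This is only a sketch, assuming central differences in place of the analytic partial derivatives; the helper names `grad`, `dot` and `norm` are illustrative.

```python
import math


def grad(f, p, h=1e-5):
    """Central-difference estimate of grad f at the point p."""
    g = []
    for i in range(len(p)):
        up = list(p)
        dn = list(p)
        up[i] += h
        dn[i] -= h
        g.append((f(*up) - f(*dn)) / (2 * h))
    return g


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


# Example 3.11: f = 3x^2 + 2y^2 + z^2 at (1, 2, 3)
def f(x, y, z):
    return 3 * x**2 + 2 * y**2 + z**2


g = grad(f, (1.0, 2.0, 3.0))          # close to (6, 8, 6)
u_hat = (2 / 3, 2 / 3, 1 / 3)         # the given unit vector
directional = dot(g, u_hat)           # close to 34/3
max_rate = norm(g)                    # close to 2*sqrt(34)

# Example 3.12: normal to the surface x^2 + y^2 - 2z = 0 at (1, 3, 5)
def s(x, y, z):
    return x**2 + y**2 - 2 * z


n = grad(s, (1.0, 3.0, 5.0))          # close to (2, 6, -2)
unit_normal = [c / norm(n) for c in n]
```

Because both functions are quadratic, the central differences agree with the analytic derivatives up to rounding error.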


The concept of the gradient of a scalar field occurs in many applications. The 
simplest, perhaps, is when f(r) represents the potential in an electric field due to static 
charges. Then the electric force is in the direction of the greatest decrease of the poten- 
tial. Its magnitude is equal to that rate of decrease, so that the force is given by —grad f. 


Using the Symbolic Math Toolbox in MATLAB the gradient grad f of the scalar function
f(x, y, z) is given by the grad function. For example, considering Example 3.11,
the gradient of the scalar function f(x, y, z) = 3x² + 2y² + z² is given by the commands

    syms x y z
    f = 3*x^2 + 2*y^2 + z^2;
    gradf = jacobian(f, [x, y, z]);
    pretty(gradf)

returning the answer

    [6x 4y 2z]

Using MuPAD the answer is returned using the commands

    delete x, y, z:
    linalg :: grad(3*x^2 + 2*y^2 + z^2, [x, y, z])

In MAPLE the answer is obtained using the commands

    with(VectorCalculus):
    Gradient(3*x^2 + 2*y^2 + z^2, [x, y, z]);

3.2.2 Exercises

21 Find grad f for f(r) = x²yz² at the point (1, 2, 3). Hence calculate

(a) the directional derivative of f(r) at (1, 2, 3) in the direction of the vector (−2, 3, −6);
(b) the maximum rate of change of the function at (1, 2, 3) and its direction.

22 Find ∇f where f(r) is

(a) x² + y² − z
(b) z tan⁻¹(y/x)
(c) e^(−x−y+z)/√(x³ + y²)
(d) xyz sin{π(x + y + z)}

23 Find the directional derivative of f(r) = x² + y² − z at the point (1, 1, 2) in the
direction of the vector (4, 4, −2).

24 Find a unit normal to the surface xy² − 3xz = −5 at the point (1, −2, 3).

25 If r is the usual position vector r = xi + yj + zk, with |r| = r, evaluate

(a) ∇r
(b) ∇(1/r)

26 If ∇φ = (2xy + z²)i + (x² + z)j + (y + 2xz)k, find a possible value for φ.

27 Given the scalar function of position

φ(x, y, z) = x²y − 3xyz + z³

find the value of grad φ at the point (3, 1, 2). Also find the directional derivative of φ
at this point in the direction of the vector (3, −2, 6); that is, in the direction 3i − 2j + 6k.

28 Find the angle between the surfaces x² + y² + z² = 9 and z = x² + y² − 3 at the
point (2, −1, 2).

29 Find the equations of the tangent plane and normal line to the surfaces

(a) x² + 2y² + 3z² = 6 at (1, 1, 1)
(b) 2x² + y² − z² = −3 at (1, 2, 3)
(c) x² + y² − z = 1 at (1, 2, 4).

30 (Spherical polar coordinates) When a function f(r) is specified in polar coordinates,
it is usual to express grad f in terms of the partial derivatives of f with respect to r, θ
and φ and the unit vectors u_r, u_θ and u_φ in the directions of increasing r, θ and φ as
shown in Figure 3.8. Working from first principles, show that

$$\nabla f = \operatorname{grad} f = \frac{\partial f}{\partial r}\mathbf{u}_r + \frac{1}{r}\frac{\partial f}{\partial \theta}\mathbf{u}_\theta + \frac{1}{r\sin\theta}\frac{\partial f}{\partial \phi}\mathbf{u}_\phi$$

Figure 3.8 Unit vectors associated with spherical polar coordinates.


3.3 Derivatives of a vector point function

When we come to consider the rate of change of a vector point function F(r), we see
that there are two ways of combining the vector operator ∇ with the vector F. Thus we
have two cases to consider, namely

∇ · F and ∇ × F

that is, the scalar product and vector product respectively. Both of these 'derivatives'
have physical meanings, as we shall discover in the following sections. Roughly, if we
picture a vector field as a fluid flow then at every point in the flow we need to measure
the rate at which the field is flowing away from that point and also the amount of spin
possessed by the particles of the fluid at that point. The two 'derivatives' given formally
above provide these measures.

3.3.1 Divergence of a vector field

Consider the steady motion of a fluid in a region R such that a particle of fluid
instantaneously at the point r with coordinates (x, y, z) has a velocity v(r) that is independent
of time. To measure the flow away from this point in the fluid, we surround the point
by an 'elementary' cuboid of side (2Δx) × (2Δy) × (2Δz), as shown in Figure 3.9, and
calculate the average flow out of the cuboid per unit volume.

Figure 3.9 Flow out of a cuboid.
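The flow out of the cuboid is the sum of the flows across each of its six faces.
Representing the velocity of the fluid at (x, y, z) by v, the flow out of the face ABCD is
given approximately by

$$\mathbf{i} \cdot \mathbf{v}(x + \Delta x, y, z)\,(4\Delta y \Delta z)$$

The flow out of the face A′B′C′D′ is given approximately by

$$-\mathbf{i} \cdot \mathbf{v}(x - \Delta x, y, z)\,(4\Delta y \Delta z)$$

There are similar expressions for the remaining four faces of the cuboid, so that the total
flow out of the latter is

$$\mathbf{i} \cdot [\mathbf{v}(x + \Delta x, y, z) - \mathbf{v}(x - \Delta x, y, z)](4\Delta y \Delta z)$$

$$+\; \mathbf{j} \cdot [\mathbf{v}(x, y + \Delta y, z) - \mathbf{v}(x, y - \Delta y, z)](4\Delta x \Delta z)$$

$$+\; \mathbf{k} \cdot [\mathbf{v}(x, y, z + \Delta z) - \mathbf{v}(x, y, z - \Delta z)](4\Delta x \Delta y)$$

Dividing by the volume 8ΔxΔyΔz, and proceeding to the limit as Δx, Δy, Δz → 0, we
see that the flow away from the point (x, y, z) per unit time is given by

$$\mathbf{i} \cdot \frac{\partial \mathbf{v}}{\partial x} + \mathbf{j} \cdot \frac{\partial \mathbf{v}}{\partial y} + \mathbf{k} \cdot \frac{\partial \mathbf{v}}{\partial z}$$

This may be rewritten as

$$\left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right) \cdot \mathbf{v}$$

or simply as ∇ · v. Thus we see that the flow away from this point is given by the scalar
product of the vector operator ∇ with the velocity vector v. This is called the divergence
of the vector v, and is written as div v. In terms of components,

$$\operatorname{div} \mathbf{v} = \nabla \cdot \mathbf{v} = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right) \cdot (\mathbf{i}v_1 + \mathbf{j}v_2 + \mathbf{k}v_3) = \frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} + \frac{\partial v_3}{\partial z} \tag{3.16}$$

When v is specified in this way, it is easy to compute its divergence. Note that the
divergence of a vector field is a scalar quantity.

Example 3.13

Find the divergence of the vector v = (2x − y², 3z + x², 4y − z²) at the point (1, 2, 3).

Solution

Here v₁ = 2x − y², v₂ = 3z + x² and v₃ = 4y − z², so that

$$\frac{\partial v_1}{\partial x} = 2, \qquad \frac{\partial v_2}{\partial y} = 0, \qquad \frac{\partial v_3}{\partial z} = -2z$$

Thus from (3.16), at a general point (x, y, z),

div v = ∇ · v = 2 − 2z

so that at the point (1, 2, 3)

∇ · v = −4

A more general way of defining the divergence of a vector field F(r) at the point r
is to enclose the point in an elementary volume ΔV and find the flow or flux out of ΔV
per unit volume. Thus

$$\operatorname{div} \mathbf{F} = \nabla \cdot \mathbf{F} = \lim_{\Delta V \to 0} \frac{\text{flow out of } \Delta V}{\Delta V}$$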


A non-zero divergence at a point in a fluid measures the rate, per unit volume, at which 
the fluid is flowing away from or towards that point. That implies that either the density 
of the fluid is changing at the point or there is a source or sink of fluid there. In the case 
of a non-material vector field, for example temperature gradient in heat transfer, a non- 
zero divergence indicates a point of generation or absorption. When the divergence is 
everywhere zero, the flow entering any element of the space is exactly balanced by the 
outflow. This implies that the lines of flow of the field F(r) where div F = 0 must either 
form closed curves or finish at boundaries or extend to infinity. Vectors satisfying this 
condition are sometimes termed solenoidal. 
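As an illustration of (3.16), the divergence in Example 3.13 can be estimated numerically. This is a minimal sketch, assuming central differences for the three partial derivatives; the function name `divergence` is an illustrative choice.

```python
def divergence(v, p, h=1e-5):
    """Central-difference estimate of div v = dv1/dx + dv2/dy + dv3/dz at p."""
    total = 0.0
    for i in range(3):
        up = list(p)
        dn = list(p)
        up[i] += h
        dn[i] -= h
        # Only component i is differentiated with respect to coordinate i.
        total += (v(*up)[i] - v(*dn)[i]) / (2 * h)
    return total


# Vector field of Example 3.13
def v(x, y, z):
    return (2 * x - y**2, 3 * z + x**2, 4 * y - z**2)


div_at_point = divergence(v, (1.0, 2.0, 3.0))   # analytic value: 2 - 2z = -4
```

A field would be solenoidal in the sense described above if this estimate were zero (to within rounding) at every point tested.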




Using MuPAD in MATLAB the divergence of a vector field is given by the
divergence function. For example, the divergence of the vector

v = (2x − y², 3z + x², 4y − z²)

considered in Example 3.13, is given by the commands

    delete x, y, z:
    linalg :: divergence([2*x - y^2, 3*z + x^2, 4*y - z^2], [x, y, z])

which return the answer

    2 - 2z

In MAPLE the answer is returned using the commands

    with(VectorCalculus):
    SetCoordinates('cartesian'[x, y, z]);
    F := VectorField(<2*x - y^2, 3*z + x^2, 4*y - z^2>);
    Divergence(F); or Del.F;


3.3.2 Exercises

31 Find div v where

(a) v(r) = 3x²yi + zj + x²k
(b) v(r) = (3x + y)i + (2z + x)j + (z − 2y)k

32 If F = (2xy² + z²)i + (3x²z² − y²z³)j + (yz² − xz³)k, calculate div F at the
point (−1, 2, 3).

33 Find ∇(a · r), (a · ∇)r and a(∇ · r), where a is a constant vector and, as usual, r is
the position vector r = (x, y, z).

34 The vector v is defined by v = rr⁻¹, where r = (x, y, z) and r = |r|. Show that

$$\nabla(\nabla \cdot \mathbf{v}) \equiv \operatorname{grad}\,\operatorname{div} \mathbf{v} = -\frac{2}{r^3}\,\mathbf{r}$$

35 Find the value of the constant λ such that the vector field defined by

F = (2x²y² + z²)i + (3xy³ − x²z)j + (λxy²z + xy)k

is solenoidal.

36 (Spherical polar coordinates) Using the notation introduced in Exercise 30, show,
working from first principles, that

$$\nabla \cdot \mathbf{v} = \operatorname{div} \mathbf{v} = \frac{1}{r^2}\frac{\partial}{\partial r}(r^2 v_r) + \frac{1}{r\sin\theta}\frac{\partial}{\partial \theta}(v_\theta \sin\theta) + \frac{1}{r\sin\theta}\frac{\partial v_\phi}{\partial \phi}$$

37 A force field F, defined by the inverse square law, is given by

F = r/r³

Show that ∇ · F = 0.

3.3.3 Curl of a vector field


It is clear from observations (for example, by watching the movements of marked corks
on water) that many fluid flows involve rotational motion of the fluid particles. Complete
determination of this motion requires knowledge of the axis of rotation, the rate
of rotation and its sense (clockwise or anticlockwise). The measure of rotation is thus a
vector quantity, which we shall find by calculating its x, y and z components separately.
Consider the vector field v(r). To find the flow around an axis in the x direction at the
point r, we take an elementary rectangle surrounding r perpendicular to the x direction,
as shown in Figure 3.10.

Figure 3.10 Flow around a rectangle.

To measure the circulation around the point r about an axis parallel to the x direction,
we calculate the flow around the elementary rectangle ABCD and divide by its
area, giving

$$[v_2(x, y^*, z - \Delta z)(2\Delta y) + v_3(x, y + \Delta y, z^*)(2\Delta z) - v_2(x, \tilde{y}, z + \Delta z)(2\Delta y) - v_3(x, y - \Delta y, \tilde{z})(2\Delta z)]/(4\Delta y \Delta z)$$

where y*, ỹ ∈ (y − Δy, y + Δy), z*, z̃ ∈ (z − Δz, z + Δz) and v = v₁i + v₂j + v₃k.
Rearranging, we obtain

$$-[v_2(x, \tilde{y}, z + \Delta z) - v_2(x, y^*, z - \Delta z)]/(2\Delta z) + [v_3(x, y + \Delta y, z^*) - v_3(x, y - \Delta y, \tilde{z})]/(2\Delta y)$$

Proceeding to the limit as ΔyΔz → 0, we obtain the x component of this vector as

$$\frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z}$$

By similar arguments, we obtain the y and z components as

$$\frac{\partial v_1}{\partial z} - \frac{\partial v_3}{\partial x}, \qquad \frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y}$$

respectively.
The vector measuring the rotation about a point in the fluid is called the curl
of v:

$$\operatorname{curl} \mathbf{v} = \left(\frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z}\right)\mathbf{i} + \left(\frac{\partial v_1}{\partial z} - \frac{\partial v_3}{\partial x}\right)\mathbf{j} + \left(\frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y}\right)\mathbf{k}$$

$$= \left(\frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z},\; \frac{\partial v_1}{\partial z} - \frac{\partial v_3}{\partial x},\; \frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y}\right) \tag{3.17}$$

It may be written formally as

$$\operatorname{curl} \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ v_1 & v_2 & v_3 \end{vmatrix} \tag{3.18}$$

or more compactly as

curl v = ∇ × v

Example 3.14

Find the curl of the vector v = (2x − y², 3z + x², 4y − z²) at the point (1, 2, 3).

Solution

Here v₁ = 2x − y², v₂ = 3z + x², v₃ = 4y − z², so that

$$\operatorname{curl} \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ 2x - y^2 & 3z + x^2 & 4y - z^2 \end{vmatrix}$$

$$= \mathbf{i}\left[\frac{\partial}{\partial y}(4y - z^2) - \frac{\partial}{\partial z}(3z + x^2)\right] - \mathbf{j}\left[\frac{\partial}{\partial x}(4y - z^2) - \frac{\partial}{\partial z}(2x - y^2)\right] + \mathbf{k}\left[\frac{\partial}{\partial x}(3z + x^2) - \frac{\partial}{\partial y}(2x - y^2)\right]$$

= i(4 − 3) − j(0 − 0) + k(2x + 2y) = i + 2(x + y)k

Thus, at the point (1, 2, 3), ∇ × v = (1, 0, 6).

More generally, the component of the curl of a vector field F(r) in the direction of the
unit vector n̂ at a point L is found by enclosing L by an elementary area ΔS that is
perpendicular to n̂, as in Figure 3.11, and calculating the flow around ΔS per unit area. Thus

$$(\operatorname{curl} \mathbf{F}) \cdot \hat{\mathbf{n}} = \lim_{\Delta S \to 0} \frac{\text{flow round } \Delta S}{\Delta S}$$

Figure 3.11 Circulation around the element ΔS.

Another way of visualizing the meaning of the curl of a vector is to consider the
motion of a rigid body. We can describe such motion by specifying the angular velocity
ω of the body about an axis OA, where O is a fixed point in the body, together with the
translational (linear) velocity v of O itself. Then at any point P in the body the velocity
u is given by

u = v + ω × r

as shown in Figure 3.12. Here v and ω are independent of (x, y, z). Thus

curl u = curl v + curl (ω × r) = 0 + curl (ω × r)

Figure 3.12 Rotation of a rigid body (ω × r is the tangential velocity, v the translation velocity).

The vector ω × r is given by

ω × r = (ω₁, ω₂, ω₃) × (x, y, z)
= (ω₂z − ω₃y)i + (ω₃x − ω₁z)j + (ω₁y − ω₂x)k

and

$$\operatorname{curl}(\boldsymbol{\omega} \times \mathbf{r}) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \omega_2 z - \omega_3 y & \omega_3 x - \omega_1 z & \omega_1 y - \omega_2 x \end{vmatrix} = 2\omega_1\mathbf{i} + 2\omega_2\mathbf{j} + 2\omega_3\mathbf{k} = 2\boldsymbol{\omega}$$

Thus

curl u = 2ω

that is,

ω = ½ curl u

Hence when any rigid body is in motion, the curl of its linear velocity at any point is
twice its angular velocity in magnitude and has the same direction.

Applying this result to the motion of a fluid, we can see by regarding particles of the
fluid as miniature bodies that when the curl of the velocity is zero there is no rotation
of the particle, and the motion is said to be curl-free or irrotational. When the curl is
non-zero, the motion is rotational.
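Both the curl of Example 3.14 and the rigid-body result curl u = 2ω can be checked numerically. The sketch below assumes central differences for the partial derivatives; the angular velocity `omega`, the translation velocity `v0` and the test point are arbitrary illustrative values, not taken from the text.

```python
def curl(v, p, h=1e-5):
    """Central-difference estimate of curl v at the point p."""
    def d(comp, var):
        # Partial derivative of component `comp` with respect to coordinate `var`.
        up = list(p)
        dn = list(p)
        up[var] += h
        dn[var] -= h
        return (v(*up)[comp] - v(*dn)[comp]) / (2 * h)

    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


# Example 3.14
def v(x, y, z):
    return (2 * x - y**2, 3 * z + x**2, 4 * y - z**2)


curl_v = curl(v, (1.0, 2.0, 3.0))   # analytic value: (1, 0, 6)

# Rigid-body velocity u = v0 + omega x r (illustrative constants)
omega = (0.3, -1.2, 0.5)
v0 = (1.0, 2.0, -1.0)


def u(x, y, z):
    w1, w2, w3 = omega
    return (v0[0] + w2 * z - w3 * y,
            v0[1] + w3 * x - w1 * z,
            v0[2] + w1 * y - w2 * x)


curl_u = curl(u, (0.7, -0.4, 2.0))  # should be close to 2 * omega
```

Since u is linear in (x, y, z), the estimate of curl u is independent of the point chosen, as the derivation predicts.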


Using MuPAD in MATLAB the command linalg :: curl(v, x) computes the
curl of the three-dimensional vector field v with respect to the three-dimensional
vector x in cartesian coordinates. For example, the curl of the vector

v = (2x − y², 3z + x², 4y − z²)

considered in Example 3.14, is given by the commands

    delete x, y, z:
    linalg :: curl([2*x - y^2, 3*z + x^2, 4*y - z^2], [x, y, z])

which return the answer

    (    1    )
    (    0    )
    ( 2x + 2y )

In MAPLE the answer is returned using the commands

    with(VectorCalculus):
    SetCoordinates('cartesian'[x, y, z]);
    F := VectorField(<2*x - y^2, 3*z + x^2, 4*y - z^2>);
    Curl(F); or Del &x F;

3.3.4 Exercises
38 Find u = curl v when v = (3xz², −yz, x + 2z).

39 A vector field is defined by v = (yz, xz, xy). Show that curl v = 0.

40 Show that if v = (2x + yz, 2y + zx, 2z + xy) then curl v = 0, and find f(r) such
that v = grad f.

41 By evaluating each term separately, verify the identity

∇ × (fv) = f(∇ × v) + (∇f) × v

for f(r) = x³ − y and v(r) = (z, 0, −x).

42 Find constants a, b and c such that the vector field defined by

F = (4xy + az³)i + (bx² + 3z)j + (6xz² + cy)k

is irrotational. With these values of a, b and c, determine a scalar function φ(x, y, z)
such that F = ∇φ.

43 If v = −yi + xj + xyzk is the velocity vector of a fluid, find the local value of the
angular velocity at the point (1, 3, 2).

44 If the velocity of a fluid at the point (x, y, z) is given by

v = (ax + by)i + (cx + dy)j

find the conditions on the constants a, b, c and d in order that

div v = 0, curl v = 0

Verify that in this case

v = ½ grad (ax² + 2bxy − ay²)

45 (Spherical polar coordinates) Using the notation introduced in Exercise 30, show that

$$\nabla \times \mathbf{v} = \operatorname{curl} \mathbf{v} = \frac{1}{r^2\sin\theta}\begin{vmatrix} \mathbf{u}_r & r\mathbf{u}_\theta & r\sin\theta\,\mathbf{u}_\phi \\ \dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial \theta} & \dfrac{\partial}{\partial \phi} \\ v_r & rv_\theta & r\sin\theta\,v_\phi \end{vmatrix}$$


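3.3.5 Further properties of the vector operator ∇

So far we have used the vector operator in three ways:

$$\nabla f = \operatorname{grad} f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k}, \qquad f(\mathbf{r}) \text{ a scalar field}$$

$$\nabla \cdot \mathbf{F} = \operatorname{div} \mathbf{F} = \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z}, \qquad \mathbf{F}(\mathbf{r}) \text{ a vector field}$$

$$\nabla \times \mathbf{F} = \operatorname{curl} \mathbf{F}, \qquad \mathbf{F}(\mathbf{r}) \text{ a vector field}$$

A further application is in determining the directional derivative of a vector field:

$$\mathbf{a} \cdot \nabla \mathbf{F} = \left(a_1\frac{\partial}{\partial x} + a_2\frac{\partial}{\partial y} + a_3\frac{\partial}{\partial z}\right)\mathbf{F}$$

$$= \left(a_1\frac{\partial f_1}{\partial x} + a_2\frac{\partial f_1}{\partial y} + a_3\frac{\partial f_1}{\partial z}\right)\mathbf{i} + \left(a_1\frac{\partial f_2}{\partial x} + a_2\frac{\partial f_2}{\partial y} + a_3\frac{\partial f_2}{\partial z}\right)\mathbf{j} + \left(a_1\frac{\partial f_3}{\partial x} + a_2\frac{\partial f_3}{\partial y} + a_3\frac{\partial f_3}{\partial z}\right)\mathbf{k}$$

The ordinary rules of differentiation carry over to this vector differential operator, but
they have to be applied with care, using the rules of vector algebra. For non-orthogonal
coordinate systems a specialist textbook should be consulted. Thus for scalar fields f(r),
g(r) and vector fields u(r), v(r) we have

$$\nabla[f(\mathbf{r}) + g(\mathbf{r})] = \nabla f(\mathbf{r}) + \nabla g(\mathbf{r}) \tag{3.19a}$$

$$\nabla[f(\mathbf{r})g(\mathbf{r})] = [\nabla f(\mathbf{r})]g(\mathbf{r}) + f(\mathbf{r})\nabla g(\mathbf{r}) \tag{3.19b}$$

$$\nabla[\mathbf{u}(\mathbf{r}) \cdot \mathbf{v}(\mathbf{r})] = \mathbf{v} \times (\nabla \times \mathbf{u}) + \mathbf{u} \times (\nabla \times \mathbf{v}) + (\mathbf{v} \cdot \nabla)\mathbf{u} + (\mathbf{u} \cdot \nabla)\mathbf{v} \tag{3.19c}$$

$$\nabla \cdot [f(\mathbf{r})\mathbf{u}(\mathbf{r})] = \mathbf{u} \cdot \nabla f + f\,\nabla \cdot \mathbf{u} \tag{3.19d}$$

$$\nabla \times [f(\mathbf{r})\mathbf{u}(\mathbf{r})] = (\nabla f) \times \mathbf{u} + f\,\nabla \times \mathbf{u} \tag{3.19e}$$

$$\nabla \cdot [\mathbf{u}(\mathbf{r}) \times \mathbf{v}(\mathbf{r})] = \mathbf{v} \cdot (\nabla \times \mathbf{u}) - \mathbf{u} \cdot (\nabla \times \mathbf{v}) \tag{3.19f}$$

$$\nabla \times [\mathbf{u}(\mathbf{r}) \times \mathbf{v}(\mathbf{r})] = (\mathbf{v} \cdot \nabla)\mathbf{u} - \mathbf{v}(\nabla \cdot \mathbf{u}) - (\mathbf{u} \cdot \nabla)\mathbf{v} + \mathbf{u}(\nabla \cdot \mathbf{v}) \tag{3.19g}$$

Higher-order derivatives can also be formed, giving the following:

$$\operatorname{div}[\operatorname{grad} f(\mathbf{r})] = \nabla \cdot \nabla f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = \nabla^2 f \tag{3.20}$$

where ∇² is called the Laplacian operator (sometimes denoted by Δ);

$$\operatorname{curl}[\operatorname{grad} f(\mathbf{r})] = \nabla \times \nabla f(\mathbf{r}) \equiv 0 \tag{3.21}$$

since

$$\nabla \times \nabla f = \left(\frac{\partial^2 f}{\partial y\,\partial z} - \frac{\partial^2 f}{\partial z\,\partial y}\right)\mathbf{i} + \left(\frac{\partial^2 f}{\partial z\,\partial x} - \frac{\partial^2 f}{\partial x\,\partial z}\right)\mathbf{j} + \left(\frac{\partial^2 f}{\partial x\,\partial y} - \frac{\partial^2 f}{\partial y\,\partial x}\right)\mathbf{k} = 0$$

when all second-order derivatives of f(r) are continuous;

$$\operatorname{div}[\operatorname{curl} \mathbf{v}(\mathbf{r})] = \nabla \cdot (\nabla \times \mathbf{v}) \equiv 0 \tag{3.22}$$

since

$$\frac{\partial}{\partial x}\left(\frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z}\right) + \frac{\partial}{\partial y}\left(\frac{\partial v_1}{\partial z} - \frac{\partial v_3}{\partial x}\right) + \frac{\partial}{\partial z}\left(\frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y}\right) = 0$$

$$\operatorname{grad}(\operatorname{div} \mathbf{v}) = \nabla(\nabla \cdot \mathbf{v}) = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)\left(\frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} + \frac{\partial v_3}{\partial z}\right) \tag{3.23}$$

$$\nabla^2 \mathbf{v} = \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) \tag{3.24}$$

$$\operatorname{curl}[\operatorname{curl} \mathbf{v}(\mathbf{r})] = \nabla \times (\nabla \times \mathbf{v}) = \nabla(\nabla \cdot \mathbf{v}) - \nabla^2 \mathbf{v} \tag{3.25}$$

Example 3.15

Verify that ∇ × (∇ × v) = ∇(∇ · v) − ∇²v for the vector field v = (3xz², −yz, x + 2z).

Solution

$$\nabla \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ 3xz^2 & -yz & x + 2z \end{vmatrix} = (y,\; 6xz - 1,\; 0)$$

$$\nabla \times (\nabla \times \mathbf{v}) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ y & 6xz - 1 & 0 \end{vmatrix} = (-6x,\; 0,\; 6z - 1)$$

$$\nabla \cdot \mathbf{v} = \frac{\partial}{\partial x}(3xz^2) + \frac{\partial}{\partial y}(-yz) + \frac{\partial}{\partial z}(x + 2z) = 3z^2 - z + 2$$

∇(∇ · v) = (0, 0, 6z − 1)

∇²v = (∇²(3xz²), ∇²(−yz), ∇²(x + 2z)) = (6x, 0, 0)

Thus

∇(∇ · v) − ∇²v = (−6x, 0, 6z − 1) = ∇ × (∇ × v)

Similar verifications for other identities are suggested in Exercises 3.3.6.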
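A verification of this kind can also be carried out numerically. The following sketch checks the curl-curl identity for the field of Example 3.15 at a single arbitrarily chosen point; it assumes nested central differences, which happen to be exact (up to rounding) for this low-degree polynomial field, and all helper names are illustrative.

```python
def partial(g, p, i, h=1e-3):
    """Central-difference estimate of the partial derivative of scalar g."""
    up = list(p)
    dn = list(p)
    up[i] += h
    dn[i] -= h
    return (g(*up) - g(*dn)) / (2 * h)


def curl(v):
    """Return a function evaluating curl v at (x, y, z)."""
    def c(x, y, z):
        p = (x, y, z)
        d = lambda comp, var: partial(lambda *q: v(*q)[comp], p, var)
        return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
    return c


def div(v):
    """Return a function evaluating div v at (x, y, z)."""
    def dv(x, y, z):
        p = (x, y, z)
        return sum(partial(lambda *q, i=i: v(*q)[i], p, i) for i in range(3))
    return dv


def grad(f):
    """Return a function evaluating grad f at (x, y, z)."""
    return lambda x, y, z: tuple(partial(f, (x, y, z), i) for i in range(3))


def vector_laplacian(v, h=1e-3):
    """Return a function applying the Laplacian to each component of v."""
    def lap(x, y, z):
        p = (x, y, z)
        out = []
        for comp in range(3):
            total = 0.0
            for i in range(3):
                up = list(p)
                dn = list(p)
                up[i] += h
                dn[i] -= h
                total += (v(*up)[comp] - 2 * v(*p)[comp] + v(*dn)[comp]) / h**2
            out.append(total)
        return tuple(out)
    return lap


def v(x, y, z):
    return (3 * x * z**2, -y * z, x + 2 * z)


point = (1.5, -0.5, 2.0)                       # arbitrary test point
lhs = curl(curl(v))(*point)                    # curl curl v
grad_div = grad(div(v))(*point)                # grad div v
lap_v = vector_laplacian(v)(*point)            # Laplacian of v
rhs = tuple(a - b for a, b in zip(grad_div, lap_v))
```

At the chosen point the analytic value of both sides is (−6x, 0, 6z − 1) = (−9, 0, 11).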


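Example 3.16

Maxwell's equations in free space may be written, in Gaussian units, as

(a) div H = 0, (b) div E = 0

$$\text{(c)}\;\; \operatorname{curl} \mathbf{H} = \nabla \times \mathbf{H} = \frac{1}{c}\frac{\partial \mathbf{E}}{\partial t}, \qquad \text{(d)}\;\; \operatorname{curl} \mathbf{E} = \nabla \times \mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{H}}{\partial t}$$

where c is the velocity of light (assumed constant). Show that these equations are
satisfied by

$$\mathbf{H} = \frac{1}{c}\frac{\partial}{\partial t}\operatorname{grad}\phi \times \mathbf{k}, \qquad \mathbf{E} = -\mathbf{k}\frac{1}{c^2}\frac{\partial^2 \phi}{\partial t^2} + \frac{\partial}{\partial z}\operatorname{grad}\phi$$

where φ satisfies

$$\nabla^2\phi = \frac{1}{c^2}\frac{\partial^2 \phi}{\partial t^2}$$

and k is a unit vector along the z axis.

Solution

(a) $$\mathbf{H} = \frac{1}{c}\frac{\partial}{\partial t}\operatorname{grad}\phi \times \mathbf{k}$$

gives

$$\operatorname{div} \mathbf{H} = \frac{1}{c}\frac{\partial}{\partial t}\operatorname{div}(\operatorname{grad}\phi \times \mathbf{k}) = \frac{1}{c}\frac{\partial}{\partial t}[\mathbf{k} \cdot \operatorname{curl}(\operatorname{grad}\phi) - (\operatorname{grad}\phi) \cdot \operatorname{curl}\mathbf{k}], \quad \text{from (3.19f)}$$

By (3.21), curl (grad φ) = 0, and since k is a constant vector, curl k = 0, so that

div H = 0

(b) $$\mathbf{E} = -\frac{\mathbf{k}}{c^2}\frac{\partial^2 \phi}{\partial t^2} + \frac{\partial}{\partial z}\operatorname{grad}\phi$$

gives

$$\operatorname{div} \mathbf{E} = -\frac{1}{c^2}\operatorname{div}\left(\mathbf{k}\frac{\partial^2 \phi}{\partial t^2}\right) + \frac{\partial}{\partial z}\operatorname{div}\operatorname{grad}\phi$$

$$= -\frac{1}{c^2}\frac{\partial}{\partial z}\left(\frac{\partial^2 \phi}{\partial t^2}\right) + \frac{\partial}{\partial z}(\nabla^2\phi), \quad \text{by (3.20)}$$

$$= \frac{\partial}{\partial z}\left(\nabla^2\phi - \frac{1}{c^2}\frac{\partial^2 \phi}{\partial t^2}\right)$$

and since ∇²φ = (1/c²)∂²φ/∂t², we have

div E = 0

(c) $$\operatorname{curl} \mathbf{H} = \frac{1}{c}\frac{\partial}{\partial t}\operatorname{curl}(\operatorname{grad}\phi \times \mathbf{k})$$

$$= \frac{1}{c}\frac{\partial}{\partial t}[(\mathbf{k} \cdot \nabla)\operatorname{grad}\phi - \mathbf{k}(\operatorname{div}\operatorname{grad}\phi) - (\operatorname{grad}\phi \cdot \nabla)\mathbf{k} + \operatorname{grad}\phi\,(\nabla \cdot \mathbf{k})], \quad \text{from (3.19g)}$$

$$= \frac{1}{c}\frac{\partial}{\partial t}\left(\frac{\partial}{\partial z}\operatorname{grad}\phi - \mathbf{k}\nabla^2\phi\right), \quad \text{since } \mathbf{k} \text{ is a constant vector}$$

$$= \frac{1}{c}\frac{\partial \mathbf{E}}{\partial t}$$

where the last step uses ∇²φ = (1/c²)∂²φ/∂t².

(d) $$\operatorname{curl} \mathbf{E} = -\frac{1}{c^2}\operatorname{curl}\left(\mathbf{k}\frac{\partial^2 \phi}{\partial t^2}\right) + \frac{\partial}{\partial z}\operatorname{curl}\operatorname{grad}\phi$$

$$= -\frac{1}{c^2}\operatorname{grad}\left(\frac{\partial^2 \phi}{\partial t^2}\right) \times \mathbf{k}, \quad \text{since curl grad } \phi = 0 \text{ by (3.21) and } \mathbf{k} \text{ is a constant vector}$$

Also,

$$\frac{\partial \mathbf{H}}{\partial t} = \frac{1}{c}\frac{\partial^2}{\partial t^2}(\operatorname{grad}\phi \times \mathbf{k}) = \frac{1}{c}\operatorname{grad}\left(\frac{\partial^2 \phi}{\partial t^2}\right) \times \mathbf{k}, \quad \text{since } \mathbf{k} \text{ is a constant vector}$$

so that we have

$$\nabla \times \mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{H}}{\partial t}$$

3.3.6 Exercises

46 Show that if g is a function of r = (x, y, z) then

$$\operatorname{grad} g = \frac{1}{r}\frac{dg}{dr}\,\mathbf{r}$$

Deduce that if u is a vector field then

div [(u × r)g] = (r · curl u)g

47 For φ(x, y, z) = x²y²z³ and F(x, y, z) = x²yi + xy²zj − yz²k determine

(a) ∇²φ (b) grad div F (c) curl curl F

48 Show that if a is a constant vector and r is the position vector r = (x, y, z) then

div {grad [(r · r)(r · a)]} = 10(r · a)

49 Verify the identity

∇²v = grad div v − curl curl v

for the vector field v = x²y(xi + yj + zk).

50 Verify, by calculating each term separately, the identities

div (u × v) = v · curl u − u · curl v

curl (u × v) = u div v − v div u + (v · ∇)u − (u · ∇)v

when u = xyj + xzk and v = xyi + yzk.

51 If r is the usual position vector r = (x, y, z), show that

$$\text{(a)}\;\; \operatorname{div}\operatorname{grad}\left(\frac{1}{r}\right) = 0$$

$$\text{(b)}\;\; \operatorname{curl}\left[\mathbf{k} \times \operatorname{grad}\left(\frac{1}{r}\right)\right] + \operatorname{grad}\left[\mathbf{k} \cdot \operatorname{grad}\left(\frac{1}{r}\right)\right] = 0$$

52 If A is a constant vector and r is the position vector r = (x, y, z), show that

$$\text{(a)}\;\; \operatorname{grad}\left(\frac{\mathbf{A} \cdot \mathbf{r}}{r^3}\right) = \frac{\mathbf{A}}{r^3} - \frac{3(\mathbf{A} \cdot \mathbf{r})}{r^5}\,\mathbf{r}$$

$$\text{(b)}\;\; \operatorname{curl}\left(\frac{\mathbf{A} \times \mathbf{r}}{r^3}\right) = \frac{2\mathbf{A}}{r^3} + \frac{3}{r^5}(\mathbf{A} \times \mathbf{r}) \times \mathbf{r}$$

53 If r is the position vector r = (x, y, z), and a and b are constant vectors, show that

(a) ∇ × r = 0
(b) (a · ∇)r = a
(c) ∇ × [(a · r)b − (b · r)a] = 2(a × b)
(d) ∇ · [(a · r)b − (b · r)a] = 0

54 By evaluating ∇ · (∇f), show that the Laplacian in spherical polar coordinates
(see Exercise 30) is given by

$$\nabla^2 f = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial f}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial \theta}\left(\sin\theta\frac{\partial f}{\partial \theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 f}{\partial \phi^2}$$

55 Show that Maxwell's equations in free space, namely

$$\operatorname{div} \mathbf{H} = 0, \qquad \operatorname{div} \mathbf{E} = 0, \qquad \nabla \times \mathbf{H} = \frac{1}{c}\frac{\partial \mathbf{E}}{\partial t}, \qquad \nabla \times \mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{H}}{\partial t}$$

are satisfied by

$$\mathbf{H} = \frac{1}{c}\operatorname{curl}\frac{\partial \mathbf{Z}}{\partial t}, \qquad \mathbf{E} = \operatorname{curl}\operatorname{curl}\mathbf{Z}$$

where the Hertzian vector Z satisfies

$$\nabla^2 \mathbf{Z} = \frac{1}{c^2}\frac{\partial^2 \mathbf{Z}}{\partial t^2}$$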


3.4 Topics in integration

In the previous sections we saw how the idea of the differentiation of a function of a
single variable is generalized to include scalar and vector point functions. We now turn
to the inverse process of integration. The fundamental idea of an integral is that of
summing all the constituent parts that make a whole. More formally, we define the
integral of a function f(x) by

$$\int_a^b f(x)\,dx = \lim_{\substack{n \to \infty \\ \text{all } \Delta x_i \to 0}} \sum_{i=1}^{n} f(\tilde{x}_i)\,\Delta x_i$$

where a = x₀ < x₁ < x₂ < ... < x_{n−1} < x_n = b, Δx_i = x_i − x_{i−1}, and x_{i−1} ≤ x̃_i ≤ x_i.
Geometrically, we can interpret this integral as the area between the graph y = f(x), the
x axis and the lines x = a and x = b, as illustrated in Figure 3.13.

Figure 3.13 Definite integral as an area.

3.4.1 Line integrals

Consider the integral

$$\int_a^b f(x, y)\,dx, \quad \text{where } y = g(x)$$

This can be evaluated in the usual way by first substituting for y in terms of x in the
integrand and then performing the integration

$$\int_a^b f(x, g(x))\,dx$$

Clearly the value of the integral will, in general, depend on the function y = g(x). It may
be interpreted as evaluating the integral of f(x, y) dx from a to b along the curve y = g(x),
as shown in Figure 3.14. Note, however, that the integral is not represented in this case by the
area under the curve. This type of integral is called a line integral.

Figure 3.14 Integral along a curve.

There are many different types of such integrals, for example

$$\int_{C,\,A}^{B} f(x, y)\,dx, \qquad \int_{C,\,A}^{B} f(x, y)\,ds, \qquad \int_{C,\,t_1}^{t_2} f(x, y)\,dt, \qquad \int_{C,\,A}^{B} [f_1(x, y)\,dx + f_2(x, y)\,dy]$$

Here the letter under the integral sign indicates that the integral is evaluated along the
curve (or path) C. This path is not restricted to two dimensions, and may be in as many
dimensions as we please. It is normal to omit the points A and B, since they are usually
implicit in the specification of C.

Example 3.17

Evaluate ∫_C xy dx from A(1, 0) to B(0, 1) along the curve C that is the portion of
x² + y² = 1 in the first quadrant.

Figure 3.15 Portion of circle.

Solution

The curve C is the first quadrant of the unit circle as shown in Figure 3.15. On the curve,
y = √(1 − x²), so that

$$\int_C xy\,dx = \int_1^0 x\sqrt{1 - x^2}\,dx = \left[-\tfrac{1}{3}(1 - x^2)^{3/2}\right]_1^0 = -\tfrac{1}{3}$$

Example 3.18

Evaluate the integral

$$I = \int_C [(x^2 + 2y)\,dx + (x + y^2)\,dy]$$

from A(0, 1) to B(2, 3) along the curve C defined by y = x + 1.

Solution

The curve C is the straight line y = x + 1 from the point A(0, 1) to the point B(2, 3).
In this case we can eliminate either x or y. Using

y = x + 1 and dy = dx

we have, on eliminating y,

$$I = \int_{x=0}^{x=2} \{[x^2 + 2(x + 1)]\,dx + [x + (x + 1)^2]\,dx\}$$

$$= \int_0^2 (2x^2 + 5x + 3)\,dx = \left[\tfrac{2}{3}x^3 + \tfrac{5}{2}x^2 + 3x\right]_0^2 = \tfrac{64}{3}$$

In many practical problems line integrals involving vectors occur. Let P(r) be a point
on a curve C in three dimensions, and let t̂ be the unit tangent vector at P in the
sense of the integration (that is, in the sense of increasing arclength s), as indicated in
Figure 3.16. Then t̂ ds is the vector element of arc at P, and

$$\hat{\mathbf{t}}\,ds = \left(\frac{dx}{ds}\mathbf{i} + \frac{dy}{ds}\mathbf{j} + \frac{dz}{ds}\mathbf{k}\right)ds = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k} = d\mathbf{r}$$

Figure 3.16 Element of arclength.

If f₁(x, y, z), f₂(x, y, z) and f₃(x, y, z) are the scalar components of a vector field F(r) then

$$\int_C [f_1(x, y, z)\,dx + f_2(x, y, z)\,dy + f_3(x, y, z)\,dz]$$

$$= \int_C \left[f_1(x, y, z)\frac{dx}{ds}\,ds + f_2(x, y, z)\frac{dy}{ds}\,ds + f_3(x, y, z)\frac{dz}{ds}\,ds\right]$$

$$= \int_C \mathbf{F} \cdot \hat{\mathbf{t}}\,ds = \int_C \mathbf{F} \cdot d\mathbf{r}$$

Thus, given a vector field F(r), we can evaluate line integrals of the form ∫_C F · dr. In
order to make it clear that we are integrating along a curve, the line integral is sometimes
written as ∫_C F · ds, where ds = dr (some authors use dl instead of ds in order to
avoid confusion with dS, the element of surface area). In a similar manner we can
evaluate line integrals of the form ∫_C F × dr.
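Line integrals of this kind reduce to ordinary integrals once the curve is parametrized, which makes them easy to check numerically. The sketch below evaluates the integrals of Examples 3.17 and 3.18 with composite Simpson's rule; the parametrizations (x = cos t, y = sin t for the circle and x = t, y = t + 1 for the line) and the helper names are illustrative choices, not taken from the text.

```python
import math


def simpson(g, a, b, n=200):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def line_integral(F, r, dr, t0, t1, n=200):
    """Integral of F . dr along r(t), t0 <= t <= t1, with dr(t) = r'(t)."""
    def integrand(t):
        return sum(Fi * di for Fi, di in zip(F(*r(t)), dr(t)))
    return simpson(integrand, t0, t1, n)


# Example 3.17: integral of xy dx along the unit circle from (1, 0) to (0, 1)
I17 = line_integral(lambda x, y: (x * y, 0.0),
                    lambda t: (math.cos(t), math.sin(t)),
                    lambda t: (-math.sin(t), math.cos(t)),
                    0.0, math.pi / 2)

# Example 3.18: integral of (x^2 + 2y) dx + (x + y^2) dy along y = x + 1
I18 = line_integral(lambda x, y: (x**2 + 2 * y, x + y**2),
                    lambda t: (t, t + 1.0),
                    lambda t: (1.0, 1.0),
                    0.0, 2.0)
```

The two values should agree with the analytic results −1/3 and 64/3 to within the accuracy of the quadrature.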


Example 3.19

Calculate (a) ∫_C F · dr and (b) ∫_C F × dr, where C is the part of the spiral
r = (a cos θ, a sin θ, aθ) corresponding to 0 ≤ θ ≤ ½π, and F = r²i.

Solution

The curve C is illustrated in Figure 3.17.

Figure 3.17 The spiral r = (a cos θ, a sin θ, aθ).

(a) Since r = a cos θ i + a sin θ j + aθ k,

dr = −a sin θ dθ i + a cos θ dθ j + a dθ k

so that

F · dr = r²i · (−a sin θ dθ i + a cos θ dθ j + a dθ k)
= −ar² sin θ dθ
= −a³(cos²θ + sin²θ + θ²) sin θ dθ = −a³(1 + θ²) sin θ dθ

since r = |r| = √(a² cos²θ + a² sin²θ + a²θ²). Thus,

$$\int_C \mathbf{F} \cdot d\mathbf{r} = -a^3 \int_0^{\pi/2} (1 + \theta^2)\sin\theta\,d\theta$$

$$= -a^3\left[\cos\theta + 2\theta\sin\theta - \theta^2\cos\theta\right]_0^{\pi/2}, \quad \text{using integration by parts}$$

$$= -a^3(\pi - 1)$$

(b) $$\mathbf{F} \times d\mathbf{r} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ r^2 & 0 & 0 \\ -a\sin\theta\,d\theta & a\cos\theta\,d\theta & a\,d\theta \end{vmatrix}$$

= −ar² dθ j + ar² cos θ dθ k
= −a³(1 + θ²) dθ j + a³(1 + θ²) cos θ dθ k

so that

$$\int_C \mathbf{F} \times d\mathbf{r} = -\mathbf{j}\,a^3 \int_0^{\pi/2} (1 + \theta^2)\,d\theta + \mathbf{k}\,a^3 \int_0^{\pi/2} (1 + \theta^2)\cos\theta\,d\theta$$

$$= -\frac{\pi a^3}{24}(12 + \pi^2)\,\mathbf{j} + \frac{a^3}{4}(\pi^2 - 4)\,\mathbf{k}$$
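Both results of Example 3.19 can be confirmed by integrating over θ numerically. The following is a sketch only, assuming composite Simpson's rule and an arbitrary illustrative value a = 2 for the spiral parameter; the helper names are not from the text.

```python
import math


def simpson(g, lo, hi, n=200):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3


a = 2.0   # illustrative value of the spiral parameter


def r(t):
    return (a * math.cos(t), a * math.sin(t), a * t)


def dr(t):
    # Derivative of r with respect to theta
    return (-a * math.sin(t), a * math.cos(t), a)


def F(x, y, z):
    return (x * x + y * y + z * z, 0.0, 0.0)   # F = r^2 i


def dot(u, w):
    return sum(p * q for p, q in zip(u, w))


def cross(u, w):
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])


scalar_integral = simpson(lambda t: dot(F(*r(t)), dr(t)), 0.0, math.pi / 2)
vector_integral = tuple(
    simpson(lambda t, i=i: cross(F(*r(t)), dr(t))[i], 0.0, math.pi / 2)
    for i in range(3)
)

expected_scalar = -a**3 * (math.pi - 1)
expected_vector = (0.0,
                   -math.pi * a**3 * (12 + math.pi**2) / 24,
                   a**3 * (math.pi**2 - 4) / 4)
```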


The work done as the point of application of a force F moves along a given path
C as illustrated in Figure 3.18 can be expressed as a line integral. The work done
as the point of application moves from P(r) to P′(r + dr), where PP′ = dr, is
dW = |dr||F| cos θ = F · dr. Hence the total work done as P goes from A to B is

$$W = \int_C \mathbf{F} \cdot d\mathbf{r}$$

Figure 3.18 Work done by a force F.

In general, W depends on the path chosen. If, however, F(r) is such that F(r) · dr is an
exact differential, say −dU, then W = ∫_C −dU = U_A − U_B, which depends only on A and
B and is the same for all paths C joining A and B. Such a force is a conservative force,
and U(r) is its potential energy, with F(r) = −grad U. Forces that do not have this property
are said to be dissipative or non-conservative.

Similarly, if v(r) represents the velocity field of a fluid then ∮_C v · dr is the flow
around the closed curve C in unit time. This is sometimes termed the net circulation
integral of v. If ∮_C v · dr = 0 then the fluid is curl-free or irrotational, and in this case v
has a potential function φ(r) such that v = −grad φ.
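Path independence for a conservative force can be illustrated numerically. The sketch below uses a hypothetical potential energy U = x²y + z² (chosen only for illustration, not taken from the text), forms F = −grad U by hand, and compares the work done along two different paths from A(0, 0, 0) to B(1, 2, 3) with U_A − U_B.

```python
def simpson(g, lo, hi, n=200):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3


def U(x, y, z):
    return x * x * y + z * z          # hypothetical potential energy


def F(x, y, z):
    return (-2 * x * y, -x * x, -2 * z)   # F = -grad U, differentiated by hand


def work(r, dr, t0=0.0, t1=1.0):
    """Work done by F along the path r(t), with dr(t) = r'(t)."""
    return simpson(lambda t: sum(f * d for f, d in zip(F(*r(t)), dr(t))), t0, t1)


# Path 1: straight line from (0, 0, 0) to (1, 2, 3)
W_line = work(lambda t: (t, 2 * t, 3 * t), lambda t: (1.0, 2.0, 3.0))

# Path 2: a curved path with the same end points
W_curve = work(lambda t: (t**2, 2 * t**3, 3 * t), lambda t: (2 * t, 6 * t**2, 3.0))

W_potential = U(0.0, 0.0, 0.0) - U(1.0, 2.0, 3.0)   # U_A - U_B = -11
```

For a non-conservative field the two path integrals would generally differ, which is the distinction drawn in the paragraph above.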


3.4.2 Exercises

56 Evaluate ∫ y ds along the parabola y = 2√x from A(3, 2√3) to B(24, 4√6).
[Recall: (ds/dx)² = 1 + (dy/dx)².]

57 Evaluate ∫_A^B [2xy dx + (x² − y²) dy] along the arc of the circle x² + y² = 1 in the
first quadrant from A(1, 0) to B(0, 1).

58 Evaluate the integral ∫_C V · dr, where V = (2yz + 3x², y² + 4xz, 2z² + 6xy), and C is
the curve with parametric equations x = t³, y = t², z = t joining the points (0, 0, 0)
and (1, 1, 1).

59 If A = (2y + 3)i + xzj + (yz − x)k, evaluate ∫_C A · dr along the following paths C:

(a) x = 2t², y = t, z = t³ from t = 0 to t = 1;
(b) the straight lines from (0, 0, 0) to (0, 0, 1), then to (0, 1, 1) and then to (2, 1, 1);
(c) the straight line joining (0, 0, 0) to (2, 1, 1).

60 Prove that F = (y² cos x + z³)i + (2y sin x − 4)j + (3xz² + z)k is a conservative
force field. Hence find the work done in moving an object in this field
from (0, 1, −1) to (π/2, −1, 2).

61 Find the work done in moving a particle in the force field F = 3x²i + (2xz − y)j + zk along

(a) the curve defined by x² = 4y, 3x³ = 8z from x = 0 to x = 2;
(b) the straight line from (0, 0, 0) to (2, 1, 3).
(c) Does this mean that F is a conservative force? Give reasons for your answer.

62 Prove that the vector field F = (3x² − y, 2yz² − x, 2y²z) is conservative, but not
solenoidal. Hence evaluate the scalar line integral ∫_C F · dr along any curve C joining
the point (0, 0, 0) to the point (1, 2, 3).

63 If F = xyi − zj + x²k and C is the curve x = t², y = 2t, z = t³ from t = 0 to t = 1,
evaluate the vector line integral ∫_C F × dr.

64 If A = (3x + y, −x, y − z) and B = (2, −3, 1) evaluate the line integral
∮_C (A × B) × dr around the circle in the (x, y) plane having centre at the origin and
radius 2, traversed in the positive direction.

3.4.3 Double integrals


In the introduction to Section 3.4 we defined the definite integral of a function f(x) of
one variable by the limit

\[ \int_a^b f(x)\,dx = \lim_{\substack{n\to\infty \\ \text{all }\Delta x_i\to 0}} \sum_{i=1}^{n} f(\tilde{x}_i)\,\Delta x_i \]

where $a = x_0 < x_1 < x_2 < \dots < x_n = b$, $\Delta x_i = x_i - x_{i-1}$ and $x_{i-1} \le \tilde{x}_i \le x_i$. This integral
is represented by the area between the curve y = f(x) and the x axis and between x = a
and x = b, as shown in Figure 3.13.


Now consider z = f(x, y) and a region R of the (x, y) plane, as shown in Figure 3.19.
Define the integral of f(x, y) over the region R by the limit

\[ \iint_R f(x, y)\,dA = \lim_{\substack{n\to\infty \\ \text{all }\Delta A_i\to 0}} \sum_{i=1}^{n} f(\tilde{x}_i, \tilde{y}_i)\,\Delta A_i \]

where $\Delta A_i$ (i = 1, ..., n) is a partition of R into n elements of area $\Delta A_i$ and $(\tilde{x}_i, \tilde{y}_i)$ is
a point in $\Delta A_i$. Now z = f(x, y) represents a surface, and so $f(\tilde{x}_i, \tilde{y}_i)\,\Delta A_i = \tilde{z}_i\,\Delta A_i$ is the
volume between z = 0 and $z = \tilde{z}_i$ on the base $\Delta A_i$. The integral $\iint_R f(x, y)\,dA$ is the limit
of the sum of all such volumes, and so it is the volume under the surface z = f(x, y) above
the region R.
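
The limiting process in this definition can be imitated on a computer by refining a partition. The following sketch is an illustration only: the integrand $x^2 + y^2$, the unit square and the choice of cell midpoints as the sample points are assumptions made for the example.

```python
def riemann_sum(f, a, b, c, d, n):
    """Sum f(midpoint) * (cell area) over an n-by-n grid on [a, b] x [c, d]."""
    dx, dy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx          # sample point in the i-th column
        for j in range(n):
            y = c + (j + 0.5) * dy      # sample point in the j-th row
            total += f(x, y) * dx * dy  # volume of one column of the partition
    return total

f = lambda x, y: x * x + y * y

# Finer partitions should approach the exact volume, here 2/3.
approximations = [riemann_sum(f, 0.0, 1.0, 0.0, 1.0, n) for n in (4, 16, 64)]
print(approximations)
```

Each refinement brings the sum of column volumes closer to the volume under the surface.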


Figure 3.19 Volume as an integral.

Figure 3.20 A possible grid for the partition of R (rectangular cartesian).

Figure 3.21 Another possible grid for the partition of R (polar).


The partition of R into elementary areas can be achieved using grid lines parallel to
the x and y axes as shown in Figure 3.20. Then $\Delta A_i = \Delta x_i\,\Delta y_i$, and we can write

\[ \iint_R f(x, y)\,dA = \iint_R f(x, y)\,dx\,dy = \lim_{n\to\infty} \sum_{i=1}^{n} f(\tilde{x}_i, \tilde{y}_i)\,\Delta x_i\,\Delta y_i \]

Other partitions may be chosen, for example a polar grid as in Figure 3.21. Then the
element of area is $(r_i\,\Delta\theta_i)\,\Delta r_i = \Delta A_i$ and

\[ \iint_R f(x, y)\,dA = \iint_R f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta \qquad (3.26) \]

The expression for $\Delta A$ is more complicated when the grid lines do not intersect at right
angles; we shall discuss this case in Section 3.4.5.

We can evaluate integrals of the type $\iint_R f(x, y)\,dx\,dy$ as repeated single integrals in
x and y. Consequently, they are usually called double integrals.

Consider the region R shown in Figure 3.22, with boundary ACBD. Let the curve
ACB be given by $y = g_1(x)$ and the curve ADB by $y = g_2(x)$. Then we can evaluate
$\iint_R f(x, y)\,dx\,dy$ by summing for y first over the $\Delta y_j$, holding x constant ($x = \tilde{x}_i$, say),
from $y = g_1(\tilde{x}_i)$ to $y = g_2(\tilde{x}_i)$, and then summing all such strips from A to B; that is, from
x = a to x = b. Thus we may write

\[ \iint_R f(x, y)\,dA = \lim_{\substack{n\to\infty \\ \text{all }\Delta x_i,\,\Delta y_j\to 0}} \sum_{i=1}^{n_1}\left[\sum_{j=1}^{n_2} f(\tilde{x}_i, y_j)\,\Delta y_j\right]\Delta x_i \qquad (n = \min(n_1, n_2)) \]

\[ = \int_a^b \left[\int_{y=g_1(x)}^{y=g_2(x)} f(x, y)\,dy\right] dx \]

Here the integral inside the brackets is evaluated first, integrating with respect to y,
keeping the value of x fixed, and then the result of this integration is integrated with
respect to x.


Figure 3.22 The region R.


Alternatively, we can sum for x first and then y. If the curve CAD is represented by
$x = h_1(y)$ and the curve CBD by $x = h_2(y)$, we can write the integral as

\[ \iint_R f(x, y)\,dA = \lim_{\substack{n\to\infty \\ \text{all }\Delta y_j,\,\Delta x_i\to 0}} \sum_{j=1}^{n_2}\left[\sum_{i=1}^{n_1} f(x_i, \tilde{y}_j)\,\Delta x_i\right]\Delta y_j \qquad (n = \min(n_1, n_2)) \]

\[ = \int_c^d \left[\int_{x=h_1(y)}^{x=h_2(y)} f(x, y)\,dx\right] dy \]

If the double integral exists then these two results are equal, and in going from one to
the other we have changed the order of integration. Notice that the limits of integration
are also changed in the process. Often, when evaluating an integral analytically, it is
easier to perform the evaluation one way rather than the other.


Example 3.20  Evaluate $\iint_R (x^2 + y^2)\,dA$ over the triangle with vertices at (0, 0), (2, 0) and (1, 1).

Figure 3.23 Domain of integration for Example 3.20.

Solution  The domain of integration is shown in Figure 3.23(a). The triangle is bounded by the
lines y = 0, y = x and y = 2 - x.

Integrating with respect to x first, as indicated in Figure 3.23(b), gives

\[ \iint_R (x^2 + y^2)\,dA = \int_0^1 \int_{x=y}^{x=2-y} (x^2 + y^2)\,dx\,dy \]

\[ = \int_0^1 \left[\tfrac{1}{3}x^3 + y^2 x\right]_{x=y}^{x=2-y} dy \]

\[ = \int_0^1 \left[\tfrac{8}{3} - 4y + 4y^2 - \tfrac{8}{3}y^3\right] dy = \tfrac{4}{3} \]

Integrating with respect to y first, as indicated in Figure 3.23(c), gives

\[ \iint_R (x^2 + y^2)\,dA = \int_0^1 \int_{y=0}^{y=x} (x^2 + y^2)\,dy\,dx + \int_1^2 \int_{y=0}^{y=2-x} (x^2 + y^2)\,dy\,dx \]

Note that because the upper boundary of the region R has different equations
along different parts, the integral has to be split up into convenient subintegrals.
Evaluating the integrals we have

\[ \int_0^1 \int_{y=0}^{y=x} (x^2 + y^2)\,dy\,dx = \int_0^1 \left[x^2 y + \tfrac{1}{3}y^3\right]_{y=0}^{y=x} dx = \int_0^1 \tfrac{4}{3}x^3\,dx = \tfrac{1}{3} \]

\[ \int_1^2 \int_{y=0}^{y=2-x} (x^2 + y^2)\,dy\,dx = \int_1^2 \left[x^2 y + \tfrac{1}{3}y^3\right]_{y=0}^{y=2-x} dx \]

\[ = \int_1^2 \left(\tfrac{8}{3} - 4x + 4x^2 - \tfrac{4}{3}x^3\right) dx = 1 \]

Thus

\[ \iint_R (x^2 + y^2)\,dA = \tfrac{1}{3} + 1 = \tfrac{4}{3}, \quad \text{as before} \]

Clearly, in this example it is easier to integrate with respect to x first.
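
As a numerical cross-check of Example 3.20, both orders of integration can be approximated as repeated single integrals. This is a minimal sketch using a midpoint rule; the helper name `integrate` and the number of subintervals are arbitrary choices for illustration.

```python
def integrate(g, lo, hi, n=400):
    """Midpoint-rule approximation of a single integral of g from lo to hi."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

f = lambda x, y: x * x + y * y

# x first: for each y in [0, 1], x runs from the line x = y to the line x = 2 - y.
x_first = integrate(lambda y: integrate(lambda x: f(x, y), y, 2.0 - y), 0.0, 1.0)

# y first: the upper boundary changes at x = 1, so the outer integral is split.
y_first = (integrate(lambda x: integrate(lambda y: f(x, y), 0.0, x), 0.0, 1.0)
           + integrate(lambda x: integrate(lambda y: f(x, y), 0.0, 2.0 - x), 1.0, 2.0))

print(x_first, y_first)   # both should be close to 4/3
```

The need to split the second calculation mirrors the extra work in the analytical solution.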


Example 3.21  Evaluate $\iint_R (x + 2y)^{-1/2}\,dA$ over the region $x - 2y \le 1$ and $x \ge y^2 + 1$.

Figure 3.24 Domain of integration for Example 3.21.

Solution  The bounding curves intersect where $2y + 1 = y^2 + 1$, which gives y = 0 (with x = 1)
and y = 2 (with x = 5). The region R is shown in Figure 3.24. In this example we choose to
take x first because the formula for the boundary is easier to deal with: $x = y^2 + 1$ rather
than $y = (x - 1)^{1/2}$. Thus we obtain

\[ \iint_R (x + 2y)^{-1/2}\,dA = \int_0^2 \int_{y^2+1}^{2y+1} (x + 2y)^{-1/2}\,dx\,dy \]

\[ = \int_0^2 \left[2(x + 2y)^{1/2}\right]_{x=y^2+1}^{x=2y+1} dy \]

\[ = \int_0^2 \left[2(4y + 1)^{1/2} - 2(y + 1)\right] dy \]

\[ = \left[\tfrac{1}{3}(4y + 1)^{3/2} - y^2 - 2y\right]_0^2 = \tfrac{2}{3} \]


As indicated earlier, the evaluation of integrals over a domain R is not restricted 
to the use of rectangular cartesian coordinates (x, y). Example 3.22 shows how polar 
coordinates can be used in some cases to simplify the analytical process. 


Example 3.22  Evaluate $\iint_R x^2 y\,dA$, where R is the region $x^2 + y^2 \le 1$.

Solution  The fact that the domain of integration is a circle suggests that polar coordinates are a
natural choice for the integration process. Then, from (3.26), $x = r\cos\theta$, $y = r\sin\theta$ and
$dA = r\,d\theta\,dr$, and the integral becomes

\[ \iint_R x^2 y\,dA = \int_{r=0}^{1} \int_{\theta=0}^{2\pi} r^2\cos^2\theta\; r\sin\theta\; r\,d\theta\,dr \]

\[ = \int_{r=0}^{1} \int_{\theta=0}^{2\pi} r^4 \cos^2\theta\,\sin\theta\,d\theta\,dr \]

Note that in this example the integration is such that we can separate the variables r and
$\theta$ and write

\[ \iint_R x^2 y\,dA = \int_{r=0}^{1} r^4 \left[\int_{\theta=0}^{2\pi} \cos^2\theta\,\sin\theta\,d\theta\right] dr \]

Furthermore, since the limits of integration with respect to $\theta$ do not involve r, we can
write

\[ \iint_R x^2 y\,dA = \int_{r=0}^{1} r^4\,dr \int_{\theta=0}^{2\pi} \cos^2\theta\,\sin\theta\,d\theta \]

and the double integral in this case reduces to a product of integrals. Thus we obtain

\[ \iint_R x^2 y\,dA = \left[\tfrac{1}{5}r^5\right]_0^1 \left[-\tfrac{1}{3}\cos^3\theta\right]_0^{2\pi} = 0 \]

Reflecting on the nature of the integrand and the domain of integration, this is the result
one would anticipate.


There are several ways of evaluating double integrals using MATLAB. The simplest
uses the command dblquad(f, x0, x1, y0, y1). For example, consider

\[ \int_0^3 \int_1^2 (x^2 + y^2)\,dx\,dy \]

Here we define the integrand as an inline function

    f = inline('x.^2 + y^2', 'x', 'y');

(Note that x is taken as a vector argument.)

    I = dblquad(f, 1, 2, 0, 3)

returns the answer

    I = 16

For non-rectangular domains, the same command is used but the integrand is
modified as shown below. Consider

\[ \int_0^1 \int_0^x (x^2 + y^2)\,dy\,dx \]

from Example 3.20. Here we define the integrand as the inline function

    f = inline('(x.^2 + y^2).*(y - x <= 0)', 'x', 'y');

where the logical expression (y - x <= 0) returns 1 if the expression is true and
0 otherwise, so that the command

    I = dblquad(f, 0, 1, 0, 1)

returns the required answer

    I = 0.3333

despite integrating over a rectangular domain.


3.4.4 Exercises 


65 Evaluate the following: 66 Evaluate 
3 r2 3 r5 
2 

(a) | | хубх+у)бубх (Ы) | | xy dy dx | | s 
0J1 241 y 
1 2 

(с) | | (2x +y )dydx over the rectangle bounded by the lines x = 0, 
-1 J -2 x=2,y=landy=2. 


67 


68 


69 


70 


71 


72 


Evaluate ff (x? +”) dx dy over the region for which 
x20,y20andx+y<1. 


Sketch the domain of integration and evaluate 


2 2x 1 1-х 
(а) | af >A (b) | af (x+y)dy 


1 x+y 0 0 


! Е 1 
(с) | dx | e ly 

0 (o J(1-x -y) 
Evaluate ff sin i n(x + y) dx dy over the triangle 
whose vertices are (0, 0), (2, 1), (1, 2). 


x 


Sketch the domains of integration of the double 
integrals 


1 1 
(a) E ydy 
0 x Vd ty) 


n/2 у 
(Ы) | dy | (cos2y) (1 - & sin^x) dx 


0 0 


Change the order of integration, and hence evaluate 
the integrals. 


Evaluate 


1 1 
0 i Nby(1 -x)] 


Sketch the domain of integration of the double 
integral 


1 dox) 
| | — r dydy 
0J 0 y +y) 


3.4.5 Green's theorem in a plane 


3.4 TOPICS IN INTEGRATION 225 


Express the integral in polar coordinates, and hence 
show that its value is i : 


Sketch the domain of integration of the double 
integral 


1 yx’) 
| | v d 
0 0 x +y) 


and evaluate the integral. 


Evaluate 


|26 
x+y +a 


over the portion of the first quadrant lying inside the 
circle x? +y? =a". 


By using polar coordinates, evaluate the double 
integral 


2 2 
[= ; dx dy 
X +y 


over the region in the first quadrant bounded by the arc 
of the parabola y* = 4(1 — x) and the coordinate axes. 


By transforming to polar coordinates, show that the 
double integral 


2 2 
| tu d 


taken over the area common to the two circles 
x? +y’ = ax and x? + y? = by is ab. 


This theorem shows the relationship between line integrals and double integrals,
and will also provide a justification for the general change of variables in a double
integral.

Consider a simple closed curve, C, enclosing the region A as shown in Figure 3.25. If
P(x, y) and Q(x, y) are continuous functions with continuous partial derivatives then

\[ \oint_C (P\,dx + Q\,dy) = \iint_A \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy \qquad (3.27) \]

where C is traversed in the positive sense (that is, so that the bounded area is always on
the left). This result is called Green's theorem in a plane.

Figure 3.25 Green's theorem.

The proof of this result is straightforward. Consider the first term on the right-hand
side. Then, with reference to Figure 3.25, where the boundary consists of the curves $x = g_1(y)$ and $x = g_2(y)$,

\[ \iint_A \frac{\partial Q}{\partial x}\,dx\,dy = \int_c^d \left[\int_{g_1(y)}^{g_2(y)} \frac{\partial Q}{\partial x}\,dx\right] dy \]

\[ = \int_c^d \left[Q(g_2(y), y) - Q(g_1(y), y)\right] dy \]

\[ = \int_{LMN} Q(x, y)\,dy - \int_{LKN} Q(x, y)\,dy \]

\[ = \int_{LMNKL} Q(x, y)\,dy = \oint_C Q(x, y)\,dy \]

Similarly,

\[ -\iint_A \frac{\partial P}{\partial y}\,dx\,dy = \oint_C P(x, y)\,dx \]

and hence

\[ \iint_A \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy = \oint_C [P(x, y)\,dx + Q(x, y)\,dy] \]

An elementary application is shown in Example 3.23.

Example 3.23  Evaluate $\oint [2x(x + y)\,dx + (x^2 + xy + y^2)\,dy]$ around the square with vertices at (0, 0),
(1, 0), (1, 1) and (0, 1) illustrated in Figure 3.26.

Figure 3.26 Path of integration for Example 3.23.

Solution  Here $P(x, y) = 2x(x + y)$ and $Q(x, y) = x^2 + xy + y^2$, so that $\partial P/\partial y = 2x$, $\partial Q/\partial x = 2x + y$
and $\partial Q/\partial x - \partial P/\partial y = y$. Thus the line integral transforms into an easy double integral

\[ \oint_C [2x(x + y)\,dx + (x^2 + xy + y^2)\,dy] = \iint_A y\,dx\,dy = \int_0^1 \int_0^1 y\,dx\,dy = \int_0^1 y\,dy \int_0^1 dx = \tfrac{1}{2} \]
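
The result of Example 3.23 can also be confirmed by evaluating the line integral directly, side by side. The sketch below assumes a midpoint rule along each straight edge; the function names are chosen only for this illustration.

```python
def P(x, y):
    return 2.0 * x * (x + y)

def Q(x, y):
    return x * x + x * y + y * y

def line_integral(vertices, n=2000):
    """Integrate P dx + Q dy along the closed polygon through `vertices`."""
    total = 0.0
    m = len(vertices)
    for s in range(m):
        (x0, y0), (x1, y1) = vertices[s], vertices[(s + 1) % m]
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for k in range(n):
            t = (k + 0.5) / n                      # midpoint of the k-th piece
            x, y = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
            total += P(x, y) * dx + Q(x, y) * dy
    return total

# Vertices listed anticlockwise, so the enclosed area stays on the left.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
circulation = line_integral(square)

# Right-hand side of Green's theorem: double integral of dQ/dx - dP/dy = y.
n = 200
double_integral = sum((j + 0.5) / n for i in range(n) for j in range(n)) / (n * n)
print(circulation, double_integral)   # both close to 1/2
```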


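It follows immediately from Green's theorem (3.27) that the area A enclosed by the
closed curve C is given by

\[ A = \iint_A dx\,dy = \oint_C x\,dy = -\oint_C y\,dx = \tfrac{1}{2}\oint_C (-y\,dx + x\,dy) \]

Suppose that under a transformation of coordinates x = x(u, v) and y = y(u, v), the curve
becomes C', enclosing an area A'. Then

\[ A' = \iint_{A'} du\,dv = \oint_{C'} u\,dv = \oint_C u\left(\frac{\partial v}{\partial x}\,dx + \frac{\partial v}{\partial y}\,dy\right) \]

\[ = \iint_A \left[\frac{\partial}{\partial x}\left(u\frac{\partial v}{\partial y}\right) - \frac{\partial}{\partial y}\left(u\frac{\partial v}{\partial x}\right)\right] dx\,dy \]

\[ = \iint_A \left[\frac{\partial u}{\partial x}\frac{\partial v}{\partial y} + u\frac{\partial^2 v}{\partial x\,\partial y} - \frac{\partial u}{\partial y}\frac{\partial v}{\partial x} - u\frac{\partial^2 v}{\partial y\,\partial x}\right] dx\,dy \]

\[ = \iint_A \left(\frac{\partial u}{\partial x}\frac{\partial v}{\partial y} - \frac{\partial u}{\partial y}\frac{\partial v}{\partial x}\right) dx\,dy \]

This implies that the element of area du dv is equivalent to the element

\[ \left|\frac{\partial u}{\partial x}\frac{\partial v}{\partial y} - \frac{\partial u}{\partial y}\frac{\partial v}{\partial x}\right| dx\,dy \]

Here the modulus sign is introduced to preserve the orientation of the curve under the
mapping. Similarly, we may prove that

\[ dx\,dy = \left|\frac{\partial(x, y)}{\partial(u, v)}\right| du\,dv \qquad (3.28) \]

where $\partial(x, y)/\partial(u, v)$ is the Jacobian

\[ \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u} = J(x, y) \]

This enables us to make a general change of coordinates in a double integral:

\[ \iint_A f(x, y)\,dx\,dy = \iint_{A'} f(x(u, v), y(u, v))\,|J|\,du\,dv \qquad (3.29) \]

where A' is the region in the (u, v) plane corresponding to A in the (x, y) plane.
Note that the above discussion confirms the result

\[ \frac{\partial(u, v)}{\partial(x, y)} = \left[\frac{\partial(x, y)}{\partial(u, v)}\right]^{-1} \]

as shown in Section 3.1.3. Using (3.29), the result (3.26) when using polar coordinates
is readily confirmed.

Example 3.24  Evaluate $\iint xy\,dx\,dy$ over the region in $x \ge 0$, $y \ge 0$ bounded by $y = x^2 + 4$, $y = x^2$,
$y = 6 - x^2$ and $y = 12 - x^2$.

Figure 3.27 Domain of integration for Example 3.24: (a) in the (x, y) plane; (b) in the (u, v) plane.

Solution  The domain of integration is shown in Figure 3.27(a). The bounding curves can be
rewritten as $y - x^2 = 4$, $y - x^2 = 0$, $y + x^2 = 6$ and $y + x^2 = 12$, so that a natural change
of coordinates is to set

\[ u = y + x^2, \qquad v = y - x^2 \]

Under this transformation, the region of integration becomes the rectangle $6 \le u \le 12$,
$0 \le v \le 4$, as shown in Figure 3.27(b). Thus since

\[ J = \frac{\partial(x, y)}{\partial(u, v)} = \left[\frac{\partial(u, v)}{\partial(x, y)}\right]^{-1} = \begin{vmatrix} 2x & 1 \\ -2x & 1 \end{vmatrix}^{-1} = \frac{1}{4x} \]

the integral simplifies to

\[ \iint_A xy\,dx\,dy = \iint_{A'} xy\,\frac{1}{4x}\,du\,dv \]

Hence

\[ \iint_A xy\,dx\,dy = \tfrac{1}{4}\iint_{A'} y\,du\,dv = \tfrac{1}{8}\iint_{A'} (u + v)\,du\,dv, \quad \text{since } y = (u + v)/2 \]

\[ = \tfrac{1}{8}\int_0^4 dv \int_6^{12} (u + v)\,du = 33 \]

Figure 3.28 Three-dimensional generalization of Green's theorem.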
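
The value obtained in Example 3.24 can be checked by integrating in both coordinate systems. In the sketch below the (x, y) integral is taken strip by strip, with the limits on y at each x worked out from the four bounding curves; this way of organizing the calculation is an assumption made for illustration and is not part of the original solution.

```python
import math

def midpoint(g, lo, hi, n):
    """Midpoint-rule approximation of a single integral."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def strip(x):
    """Integral of x*y over the y-range allowed at this x (zero if empty)."""
    y_lo = max(x * x, 6.0 - x * x)           # v >= 0 and u >= 6
    y_hi = min(x * x + 4.0, 12.0 - x * x)    # v <= 4 and u <= 12
    if y_hi <= y_lo:
        return 0.0
    return x * 0.5 * (y_hi ** 2 - y_lo ** 2)  # exact inner integral of x*y dy

# In the (x, y) plane x^2 = (u - v)/2, which lies between 1 and 6 on the region.
direct = midpoint(strip, 1.0, math.sqrt(6.0), 20000)

# In the (u, v) plane the integrand x*y*|J| reduces to (u + v)/8 on a rectangle.
transformed = midpoint(
    lambda v: midpoint(lambda u: (u + v) / 8.0, 6.0, 12.0, 200), 0.0, 4.0, 200)

print(direct, transformed)   # both close to 33
```

The transformed integral needs no case analysis at all, which is the practical benefit of the change of variables.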


We remark in passing that Green's theorem in a plane may be generalized to three
dimensions. Note that the result (3.27) may be written as

\[ \oint_C (P, Q, 0)\cdot d\mathbf{r} = \iint_A \operatorname{curl}[(P, Q, 0)]\cdot\mathbf{k}\,dx\,dy \]

For a general surface S with bounding curve C as shown in Figure 3.28 this identity
becomes

\[ \oint_C \mathbf{F}(\mathbf{r})\cdot d\mathbf{r} = \iint_S \operatorname{curl}\mathbf{F}(\mathbf{r})\cdot d\mathbf{S} \]

where $d\mathbf{S} = \hat{\mathbf{n}}\,dS$ is the vector element of surface area and $\hat{\mathbf{n}}$ is a unit vector along the
normal. This generalization is called Stokes' theorem, and will be discussed in
Section 3.4.12 after we have formally introduced the concept of a surface integral.


3.4.6 Exercises 


77 Evaluate the line integral 


$ siny dxs = coss) 0 


С 


Verify your answer using Green's theorem in a plane. 


78 . Use Green's theorem in a plane to evaluate 


} [G^ - y) dx - (x »^) dy] 


C 


taken in the anticlockwise sense, where C is the 
perimeter of the triangle formed by the lines 


у=}тх, y75T, x=0 


Nie 


as a double integral, where C is the triangle with 
vertices at (0, 0), (2, 0) and (2, 2) and is traversed 
in the anticlockwise direction. 
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79 Evaluate the line integral 81 Evaluate 


a 2a-x 

I= ф (xy dx *x dy) | af ———À dy 
0 , 4а +(у+х) 

c 

using the transformation of coordinates u = x + y, 


where C is the closed curve consisting of y = x? v=x-y 


from x = 0 to x = 1 and y = yx from x = 1 to x = 0. 
Confirm your answer by applying Green’s theorem 82 Using the transformation 
in the plane and evaluating / as a double integral. 


= Ў 
: 5 + у = A — = 
80 . Use Green's theorem in a plane to evaluate the line qc сЕ 
inti 1 
integra. show that 
* -3y)) dx (e! - 4x) d е" : m Е 
} [(е y ) X (e x ) y] dy IHY dx= du edv =e- 1 
С 0 y Х 0 0 


where C is the circle x? + y? = 4. (Hint: use polar 
coordinates to evaluate the double integral.) 


3.4.7 Surface integrals 


The extensions of the idea of an integral to line and double integrals are not the only 
generalizations that can be made. We can also extend the idea to integration over a 
general surface S. Two types of such integrals occur: 


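(a) $\iint_S f(x, y, z)\,dS$

(b) $\iint_S \mathbf{F}(\mathbf{r})\cdot\hat{\mathbf{n}}\,dS = \iint_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{S}$

In case (a) we have a scalar field f(r) and in case (b) a vector field F(r). Note that
$d\mathbf{S} = \hat{\mathbf{n}}\,dS$ is the vector element of area, where $\hat{\mathbf{n}}$ is the unit outward-drawn normal
vector to the element dS.

In general, the surface S can be described in terms of two parameters, u and v say,
so that on S

\[ \mathbf{r} = \mathbf{r}(u, v) = (x(u, v), y(u, v), z(u, v)) \]

The surface S can be specified by a scalar point function C(r) = c, where c is a
constant. Curves may be drawn on that surface, and in particular if we fix the value of
one of the two parameters u and v then we obtain two families of curves. On one,
$C_u(\mathbf{r}(u, v_0))$, the value of u varies while v is fixed, and on the other, $C_v(\mathbf{r}(u_0, v))$, the
value of v varies while u is fixed, as shown in Figure 3.29. Then as indicated in
Figure 3.29, the vector element of area dS is given by

Figure 3.29 Parametric curves on a surface.

\[ d\mathbf{S} = \frac{\partial\mathbf{r}}{\partial u}\,du \times \frac{\partial\mathbf{r}}{\partial v}\,dv = \frac{\partial\mathbf{r}}{\partial u}\times\frac{\partial\mathbf{r}}{\partial v}\,du\,dv \]

\[ = \left(\frac{\partial x}{\partial u}, \frac{\partial y}{\partial u}, \frac{\partial z}{\partial u}\right)\times\left(\frac{\partial x}{\partial v}, \frac{\partial y}{\partial v}, \frac{\partial z}{\partial v}\right) du\,dv = (J_1\mathbf{i} + J_2\mathbf{j} + J_3\mathbf{k})\,du\,dv \]

where

\[ J_1 = \frac{\partial y}{\partial u}\frac{\partial z}{\partial v} - \frac{\partial y}{\partial v}\frac{\partial z}{\partial u}, \quad J_2 = \frac{\partial z}{\partial u}\frac{\partial x}{\partial v} - \frac{\partial z}{\partial v}\frac{\partial x}{\partial u}, \quad J_3 = \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u} \qquad (3.30) \]

Hence

\[ \iint_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{S} = \iint_A (PJ_1 + QJ_2 + RJ_3)\,du\,dv \]

\[ \iint_S f(x, y, z)\,dS = \iint_A f(u, v)\sqrt{J_1^2 + J_2^2 + J_3^2}\,du\,dv \]

where F(r) = (P, Q, R) and A is the region of the (u, v) plane corresponding to S. Here,
of course, the terms in the integrands have to be expressed in terms of u and v.

In particular, u and v can be chosen as any two of x, y and z. For example, if z = z(x, y)
describes a surface as in Figure 3.30 then

\[ \mathbf{r} = (x, y, z(x, y)) \]

with x and y as independent variables. This gives

\[ J_1 = -\frac{\partial z}{\partial x}, \quad J_2 = -\frac{\partial z}{\partial y}, \quad J_3 = 1 \]

and

\[ \iint_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{S} = \iint_A \left[-P\frac{\partial z}{\partial x} - Q\frac{\partial z}{\partial y} + R\right] dx\,dy \qquad (3.31a) \]

\[ \iint_S f(x, y, z)\,dS = \iint_A f(x, y, z(x, y))\sqrt{1 + \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2}\,dx\,dy \qquad (3.31b) \]

Figure 3.30 A surface described by z = z(x, y).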
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Example 3.25 . Evaluate the surface integral 


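\[ \iint_S (x + y + z)\,dS \]

where S is the portion of the sphere $x^2 + y^2 + z^2 = 1$ that lies in the first octant.

Figure 3.31 (a) Surface S for Example 3.25; (b) quadrant of a circle in the (x, y) plane.

Solution  The surface S is illustrated in Figure 3.31(a). Taking

\[ z = \sqrt{1 - x^2 - y^2} \]

we have

\[ \frac{\partial z}{\partial x} = \frac{-x}{\sqrt{1 - x^2 - y^2}}, \qquad \frac{\partial z}{\partial y} = \frac{-y}{\sqrt{1 - x^2 - y^2}} \]

giving

\[ \sqrt{1 + \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2} = \sqrt{\frac{x^2 + y^2 + (1 - x^2 - y^2)}{1 - x^2 - y^2}} = \frac{1}{\sqrt{1 - x^2 - y^2}} \]

Using (3.31b) then gives

\[ \iint_S (x + y + z)\,dS = \iint_A \frac{x + y + \sqrt{1 - x^2 - y^2}}{\sqrt{1 - x^2 - y^2}}\,dx\,dy \]

where A is the quadrant of a circle in the (x, y) plane illustrated in Figure 3.31(b).
Thus

\[ \iint_S (x + y + z)\,dS = \int_0^1 dx \int_0^{\sqrt{1 - x^2}} \left[\frac{x}{\sqrt{1 - x^2 - y^2}} + \frac{y}{\sqrt{1 - x^2 - y^2}} + 1\right] dy \]

\[ = \int_0^1 \left[x\sin^{-1}\!\left(\frac{y}{\sqrt{1 - x^2}}\right) - \sqrt{1 - x^2 - y^2} + y\right]_0^{\sqrt{1 - x^2}} dx \]

\[ = \int_0^1 \left[\tfrac{1}{2}\pi x + 2\sqrt{1 - x^2}\right] dx \]

\[ = \left[\tfrac{1}{4}\pi x^2 + x\sqrt{1 - x^2} + \sin^{-1}x\right]_0^1 \]

\[ = \tfrac{3}{4}\pi \]

An alternative approach to evaluating the surface integral in Example 3.25 is to evaluate
it directly over the surface of the sphere using spherical polar coordinates. As illustrated
in Figure 3.32, on the surface of a sphere of radius a we have

\[ x = a\sin\theta\cos\phi, \quad y = a\sin\theta\sin\phi, \quad z = a\cos\theta, \quad dS = a^2\sin\theta\,d\theta\,d\phi \]

Figure 3.32 Surface element in spherical polar coordinates.

In the sphere of Example 3.25 the radius a = 1, so that

\[ \iint_S (x + y + z)\,dS = \int_0^{\pi/2} \int_0^{\pi/2} (\sin\theta\cos\phi + \sin\theta\sin\phi + \cos\theta)\sin\theta\,d\theta\,d\phi \]

\[ = \int_0^{\pi/2} \left[\tfrac{1}{4}\pi\cos\phi + \tfrac{1}{4}\pi\sin\phi + \tfrac{1}{2}\right] d\phi = \tfrac{3}{4}\pi \]

as determined in Example 3.25.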
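
The spherical polar form of the integral in Example 3.25 is straightforward to approximate numerically. The following sketch assumes a midpoint rule in both angles; the helper names are illustrative only.

```python
import math

def midpoint(g, lo, hi, n=400):
    """Midpoint-rule approximation of a single integral."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def integrand(theta, phi):
    # Point on the unit sphere (a = 1) in spherical polar coordinates.
    x = math.sin(theta) * math.cos(phi)
    y = math.sin(theta) * math.sin(phi)
    z = math.cos(theta)
    return (x + y + z) * math.sin(theta)   # f times the area factor sin(theta)

half_pi = 0.5 * math.pi
surface_integral = midpoint(
    lambda phi: midpoint(lambda theta: integrand(theta, phi), 0.0, half_pi),
    0.0, half_pi)

print(surface_integral, 0.75 * math.pi)
```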
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Figure 3.33 
Surface element in cylindrical polar coordinates.

In a similar manner, when evaluating surface integrals over the surface of a cylinder of
radius a, we have, as illustrated in Figure 3.33,

\[ x = a\cos\phi, \quad y = a\sin\phi, \quad z = z, \quad dS = a\,dz\,d\phi \]

Example 3.26  Find the surface area of the torus shown in Figure 3.34(a) formed by rotating a circle
of radius b about an axis distance a from its centre.

Figure 3.34 (a) Torus of Example 3.26; (b) position vector of a point on the surface of the torus.

Solution  From Figure 3.34(b), the position vector r of a point on the surface is given by

\[ \mathbf{r} = (a + b\cos\phi)\cos\theta\,\mathbf{i} + (a + b\cos\phi)\sin\theta\,\mathbf{j} + b\sin\phi\,\mathbf{k} \]

(Notice that $\theta$ and $\phi$ are not the angles used for spherical polar coordinates.) Thus
using (3.30),

\[ J_1 = (a + b\cos\phi)\cos\theta\,(b\cos\phi) - (-b\sin\phi\sin\theta)(0) \]
\[ J_2 = (0)(-b\sin\phi\cos\theta) - (b\cos\phi)(a + b\cos\phi)(-\sin\theta) \]
\[ J_3 = -(a + b\cos\phi)\sin\theta\,(-b\sin\phi\sin\theta) - (-b\sin\phi\cos\theta)(a + b\cos\phi)\cos\theta \]

Simplifying, we obtain

\[ J_1 = b(a + b\cos\phi)\cos\theta\cos\phi \]
\[ J_2 = b(a + b\cos\phi)\sin\theta\cos\phi \]
\[ J_3 = b(a + b\cos\phi)\sin\phi \]

and the surface area is given by

\[ S = \int_0^{2\pi}\int_0^{2\pi} \sqrt{J_1^2 + J_2^2 + J_3^2}\,d\theta\,d\phi \]

\[ = \int_0^{2\pi}\int_0^{2\pi} b(a + b\cos\phi)\,d\theta\,d\phi \]

\[ = 4\pi^2 ab \]

Thus the surface area of the torus is the product of the circumferences of the two circles
that generate it.

Example 3.27  Evaluate $\iint_S \mathbf{V}\cdot d\mathbf{S}$, where $\mathbf{V} = z\mathbf{i} + x\mathbf{j} - 3y^2z\mathbf{k}$ and S is the surface of the cylinder
$x^2 + y^2 = 16$ in the first octant between z = 0 and z = 5.

Figure 3.35 Surface S for Example 3.27.

Solution  The surface S is illustrated in Figure 3.35. From Section 3.2.1, the outward normal to
the surface is in the direction of the vector

\[ \mathbf{n} = \operatorname{grad}(x^2 + y^2 - 16) = 2x\mathbf{i} + 2y\mathbf{j} \]

so that the unit outward normal $\hat{\mathbf{n}}$ is given by

\[ \hat{\mathbf{n}} = \frac{2x\mathbf{i} + 2y\mathbf{j}}{2\sqrt{x^2 + y^2}} \]

Hence on the surface $x^2 + y^2 = 16$,

\[ \hat{\mathbf{n}} = \tfrac{1}{4}(x\mathbf{i} + y\mathbf{j}) \]

giving

\[ d\mathbf{S} = dS\,\hat{\mathbf{n}} = \tfrac{1}{4}dS\,(x\mathbf{i} + y\mathbf{j}) \]

Projecting the element of surface dS onto the (x, z) plane as illustrated in Figure 3.35,
the area dx dz of the projected element is given by

\[ dx\,dz = dS\cos\beta \]

where $\beta$ is the angle between the normal $\hat{\mathbf{n}}$ to the surface element and the normal j to
the (x, z) plane. Thus

\[ dx\,dz = dS\,|\hat{\mathbf{n}}\cdot\mathbf{j}| = \tfrac{1}{4}dS\,|(x\mathbf{i} + y\mathbf{j})\cdot\mathbf{j}| = \tfrac{1}{4}dS\,y \]

giving

\[ dS = \frac{4}{y}\,dx\,dz \]

Also,

\[ \mathbf{V}\cdot d\mathbf{S} = \mathbf{V}\cdot\hat{\mathbf{n}}\,dS = (z\mathbf{i} + x\mathbf{j} - 3y^2z\mathbf{k})\cdot\left(\frac{x\mathbf{i} + y\mathbf{j}}{y}\right) dx\,dz = \frac{xz + xy}{y}\,dx\,dz \]

so that

\[ \iint_S \mathbf{V}\cdot d\mathbf{S} = \iint_A \frac{xz + xy}{y}\,dx\,dz \]

where A is the rectangular region in the (x, z) plane bounded by $0 \le x \le 4$, $0 \le z \le 5$.
Noting that the integrand is still evaluated on the surface, we can write $y = \sqrt{16 - x^2}$,
so that

\[ \iint_S \mathbf{V}\cdot d\mathbf{S} = \int_0^4 \int_0^5 \left[x + \frac{xz}{\sqrt{16 - x^2}}\right] dz\,dx \]

\[ = \int_0^4 \left[xz + \frac{xz^2}{2\sqrt{16 - x^2}}\right]_0^5 dx \]

\[ = \int_0^4 \left[5x + \frac{25x}{2\sqrt{16 - x^2}}\right] dx \]

\[ = \left[\tfrac{5}{2}x^2 - \tfrac{25}{2}\sqrt{16 - x^2}\right]_0^4 \]

\[ = 90 \]

An alternative approach in this case is to evaluate $\tfrac{1}{4}\iint_S (xz + xy)\,dS$ directly over
the surface using cylindrical polar coordinates. This is left as Exercise 90, in
Exercises 3.4.8.
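
The torus result of Example 3.26 lends itself to a direct numerical check of the formula $\iint \sqrt{J_1^2 + J_2^2 + J_3^2}\,d\theta\,d\phi$. In this sketch the partial derivatives of r are written out by hand, and the values a = 3, b = 1 in the usage line are arbitrary illustrative choices.

```python
import math

def area_element(theta, phi, a, b):
    """|dr/dtheta x dr/dphi| for the torus parametrization of Example 3.26."""
    R = a + b * math.cos(phi)
    r_theta = (-R * math.sin(theta), R * math.cos(theta), 0.0)
    r_phi = (-b * math.sin(phi) * math.cos(theta),
             -b * math.sin(phi) * math.sin(theta),
             b * math.cos(phi))
    # Components of the cross product, matching J1, J2, J3 with u = theta, v = phi.
    j1 = r_theta[1] * r_phi[2] - r_theta[2] * r_phi[1]
    j2 = r_theta[2] * r_phi[0] - r_theta[0] * r_phi[2]
    j3 = r_theta[0] * r_phi[1] - r_theta[1] * r_phi[0]
    return math.sqrt(j1 * j1 + j2 * j2 + j3 * j3)

def torus_area(a, b, n=200):
    """Midpoint-rule double integral over 0 <= theta, phi <= 2*pi."""
    h = 2.0 * math.pi / n
    return sum(area_element((i + 0.5) * h, (j + 0.5) * h, a, b)
               for i in range(n) for j in range(n)) * h * h

print(torus_area(3.0, 1.0), 4.0 * math.pi ** 2 * 3.0 * 1.0)
```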


3.4.8 Exercises

83 Evaluate the area of the surface $z = 2 - x^2 - y^2$ lying above the (x, y) plane. (Hint: Use polar coordinates to evaluate the double integral.)

84 Evaluate
(a) $\iint_S (x^2 + y^2)\,dS$, where S is the surface area of the plane 2x + y + 2z = 6 cut off by the planes z = 0, z = 2, y = 0, y = 3;
(b) $\iint_S z\,dS$, where S is the surface area of the hemisphere $x^2 + y^2 + z^2 = 1$ (z > 0) cut off by the cylinder $x^2 - x + y^2 = 0$.

85 Evaluate $\iint_S \mathbf{v}\cdot d\mathbf{S}$, where
(a) $\mathbf{v} = (xy, -x^2, x + z)$ and S is the part of the plane 2x + 2y + z = 6 included in the first octant;
(b) $\mathbf{v} = (3y, 2x^2, z^3)$ and S is the surface of the cylinder $x^2 + y^2 = 1$, 0 < z < 1.

86 Show that $\iint_S z^2\,dS = \tfrac{2}{3}\pi$, where S is the surface of the sphere $x^2 + y^2 + z^2 = 1$, $z \ge 0$.

87 Evaluate the surface integral $\iint_S U(x, y, z)\,dS$, where S is the surface of the paraboloid $z = 2 - (x^2 + y^2)$ above the (x, y) plane and U(x, y, z) is given by
(a) 1  (b) $x^2 + y^2$  (c) z
Give a physical interpretation in each case.

88 Determine the surface area of the plane 2x + y + 2z = 16 cut off by x = 0, y = 0 and $x^2 + y^2 = 64$.

89 Show that the area of that portion of the surface of the paraboloid $x^2 + y^2 = 4z$ included between the planes z = 1 and z = 3 is $\tfrac{16}{3}\pi(4 - \sqrt{2})$.

90 Evaluate the surface integral in Example 3.27 using cylindrical polar coordinates.

91 If $\mathbf{F} = y\mathbf{i} + (x - 2xz)\mathbf{j} - xy\mathbf{k}$, evaluate the surface integral $\iint_S (\operatorname{curl}\mathbf{F})\cdot d\mathbf{S}$, where S is the surface of the sphere $x^2 + y^2 + z^2 = a^2$, $z \ge 0$.


3.4.9 Volume integrals


In Section 3.4.7 we defined the integral of a function over a curved surface in three
dimensions. This idea can be extended to define the integral of a function of three
variables through a region T of three-dimensional space by the limit

\[ \iiint_T f(x, y, z)\,dV = \lim_{\substack{n\to\infty \\ \text{all }\Delta V_i\to 0}} \sum_{i=1}^{n} f(\tilde{x}_i, \tilde{y}_i, \tilde{z}_i)\,\Delta V_i \]

where $\Delta V_i$ (i = 1, ..., n) is a partition of T into n elements of volume, and $(\tilde{x}_i, \tilde{y}_i, \tilde{z}_i)$ is
a point in $\Delta V_i$ as illustrated in Figure 3.36.

In terms of rectangular cartesian coordinates the triple integral can, as illustrated in
Figure 3.37, be written as

\[ \iiint_T f(x, y, z)\,dV = \int_a^b dx \int_{g_1(x)}^{g_2(x)} dy \int_{h_1(x, y)}^{h_2(x, y)} f(x, y, z)\,dz \qquad (3.32) \]

Note that there are six different orders in which the integration in (3.32) can be
carried out.

Figure 3.36 Partition of region T into volume elements $\Delta V_i$.

Figure 3.37 The volume integral in terms of rectangular cartesian coordinates.

As we saw for double integrals in (3.28), the expression for the element of volume
dV = dx dy dz under the transformation x = x(u, v, w), y = y(u, v, w), z = z(u, v, w) may
be obtained using the Jacobian

\[ J = \frac{\partial(x, y, z)}{\partial(u, v, w)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} & \dfrac{\partial z}{\partial u} \\[2mm] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} & \dfrac{\partial z}{\partial v} \\[2mm] \dfrac{\partial x}{\partial w} & \dfrac{\partial y}{\partial w} & \dfrac{\partial z}{\partial w} \end{vmatrix} \]

as

\[ dV = dx\,dy\,dz = |J|\,du\,dv\,dw \qquad (3.33) \]

For example, in the case of cylindrical polar coordinates

\[ x = \rho\cos\phi, \quad y = \rho\sin\phi, \quad z = z \]

\[ J = \rho\begin{vmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{vmatrix} = \rho \]

so that

\[ dV = \rho\,d\rho\,d\phi\,dz \qquad (3.34) \]

a result illustrated in Figure 3.38.

Similarly, for spherical polar coordinates $(r, \theta, \phi)$

\[ x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta \]

\[ J = \begin{vmatrix} \sin\theta\cos\phi & \sin\theta\sin\phi & \cos\theta \\ r\cos\theta\cos\phi & r\cos\theta\sin\phi & -r\sin\theta \\ -r\sin\theta\sin\phi & r\sin\theta\cos\phi & 0 \end{vmatrix} = r^2\sin\theta \]

so that

\[ dV = r^2\sin\theta\,dr\,d\theta\,d\phi \qquad (3.35) \]

a result illustrated in Figure 3.39.
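
The Jacobian for spherical polar coordinates can be checked at any chosen point by differentiating the transformation numerically. The sketch below uses central differences; the sample point and the step size are arbitrary assumptions for illustration.

```python
import math

def spherical(r, theta, phi):
    """Map (r, theta, phi) to cartesian (x, y, z)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def jacobian(transform, point, h=1e-6):
    """Determinant of d(x, y, z)/d(u, v, w) using central differences."""
    rows = []
    for k in range(3):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        fp, fm = transform(*plus), transform(*minus)
        # Row k holds the derivatives of x, y, z with respect to the k-th variable.
        rows.append([(p - m) / (2.0 * h) for p, m in zip(fp, fm)])
    (a, b, c), (d, e, f), (g, i, j) = rows
    return a * (e * j - f * i) - b * (d * j - f * g) + c * (d * i - e * g)

r, theta, phi = 2.0, 0.7, 1.1
numeric = jacobian(spherical, (r, theta, phi))
exact = r * r * math.sin(theta)   # the value r^2 sin(theta) from (3.35)
print(numeric, exact)
```

The same `jacobian` helper can be applied to the cylindrical polar transformation to confirm (3.34).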


Figure 3.38 Volume element in cylindrical polar coordinates.

Figure 3.39 Volume element in spherical polar coordinates.

Example 3.28  Find the volume and the coordinates of the centroid of the tetrahedron defined by $x \ge 0$,
$y \ge 0$, $z \ge 0$ and $x + y + z \le 1$.

Figure 3.40 Tetrahedron for Example 3.28.

Solution  The tetrahedron is shown in Figure 3.40. Its volume is

\[ V = \iiint_{\text{tetrahedron}} dx\,dy\,dz = \int_{x=0}^{x=1} dx \int_{y=0}^{y=1-x} dy \int_{z=0}^{z=1-x-y} dz \]

\[ = \int_0^1 dx \int_0^{1-x} (1 - x - y)\,dy = \int_0^1 \tfrac{1}{2}(1 - x)^2\,dx = \tfrac{1}{6} \]

Let the coordinates of the centroid be $(\bar{x}, \bar{y}, \bar{z})$; then, taking moments about the line x = 0,
$z = \bar{z}$,

\[ \bar{x}V = \iiint_{\text{tetrahedron}} x\,dV = \iiint_{\text{tetrahedron}} x\,dx\,dy\,dz \]

\[ = \int_0^1 x\,dx \int_0^{1-x} dy \int_0^{1-x-y} dz = \int_0^1 \tfrac{1}{2}x(1 - x)^2\,dx = \tfrac{1}{24} \]

Hence $\bar{x} = \tfrac{1}{4}$, and by symmetry $\bar{y} = \bar{z} = \tfrac{1}{4}$.

Example 3.29  Find the moment of inertia of a uniform sphere of mass M and radius a about a diameter.

Solution  A sphere of radius a has volume $4\pi a^3/3$, so that its density is $3M/4\pi a^3$. Then the moment
of inertia of the sphere about the z axis is

\[ I = \frac{3M}{4\pi a^3}\iiint_{\text{sphere}} (x^2 + y^2)\,dx\,dy\,dz \]

In this example it is natural to use spherical polar coordinates, so that

\[ I = \frac{3M}{4\pi a^3}\iiint_{\text{sphere}} (r^2\sin^2\theta)\,r^2\sin\theta\,dr\,d\theta\,d\phi \]

\[ = \frac{3M}{4\pi a^3}\int_0^a r^4\,dr \int_0^{\pi} \sin^3\theta\,d\theta \int_0^{2\pi} d\phi = \frac{3M}{4\pi a^3}\left(\tfrac{1}{5}a^5\right)\left(\tfrac{4}{3}\right)(2\pi) \]

\[ = \tfrac{2}{5}Ma^2 \]
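
The repeated integrals of Example 3.28 can be reproduced numerically by nesting three single integrals with the same limits. This is a rough sketch with a midpoint rule; the helper names and the numbers of subintervals are illustrative assumptions.

```python
def midpoint(g, lo, hi, n):
    """Midpoint-rule approximation of a single integral."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def over_tetrahedron(f, n=120, nz=20):
    """Integrate f(x, y, z) over x, y, z >= 0, x + y + z <= 1 in the order dz, dy, dx."""
    return midpoint(
        lambda x: midpoint(
            lambda y: midpoint(lambda z: f(x, y, z), 0.0, 1.0 - x - y, nz),
            0.0, 1.0 - x, n),
        0.0, 1.0, n)

volume = over_tetrahedron(lambda x, y, z: 1.0)
x_bar = over_tetrahedron(lambda x, y, z: x) / volume
print(volume, x_bar)   # close to 1/6 and 1/4
```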


Evaluating triple integrals using MATLAB uses the command triplequad. For
example, consider (see Example 3.28):

\[ \int_0^1 \int_0^{1-x} \int_0^{1-x-y} x\,dz\,dy\,dx \]

Here we write the integrand as the inline function

    f = inline('x.*(x + y + z <= 1)', 'x', 'y', 'z');

so that the command

    I = triplequad(f, 0, 1, 0, 1, 0, 1)

returns the answer

    I = 0.0416

This procedure could be slow because of the large number of points at which the
integrand is evaluated.


3.4.10 Exercises 


92 Evaluate the triple integrals 95 Evaluate fff, xyz dx dy dz, where V is the region 
bounded by the planes x = 0, y = 0, z = 0 and 


i ] x+y+z=1 
«| af «| xyz dz ? : 
0 0 1 96 


Sketch the region contained between the parabolic 
cylinders y 2 x? and x 2 y? and the planes z = 0 and 


2 [3 f4 
(b) | | | хуг? dz dy dx x+y+z=2. Show that the volume of the region 


aida may be expressed as the triple integral 


l f4x f2-x-y 
93 Show that | | : | dz dy dx 
Os x 0 


1 2 x4x | 
| | «| (x+y+z)dy=0 and evaluate it. 
7 И B 97 Use spherical polar coordinates to evaluate 
94 Evaluate fff sin (x - y ^ z) dx dy dz over the жч дыд 
portion of the positive octant cut off by the plane 


х+у+2= т. y 


where V is the region in the first octant lying within 
the sphere x? - y? - z?- 1. 
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where V is the volume of the tetrahedron 
bounded by the planes x = 0, y = 0, z = 0 and 
x+y+z=l1. 


98 Evaluate f f f x^y?z?(x + y +z) dx dy dz throughout 
the region defined by x +y +z S lx 20,y 70, 100 Evaluate fff. yz dx dy dz taken throughout the 
py prism with sides parallel to the z axis, whose base 
. is the triangle with vertices at (0, 0, 0), (1, 0, 0), 
99 Show that if x + y +z = u, y + z = w and z = uvw (0, 1, 0) and whose top is the triangle with vertices 
then at (0, 0, 2), (1, 0, 1), (0, 1, 1). Find also the position 
of the centroid of this prism. 
OQ yz) _ ds 
O(u, v, w) 
101 Evaluate fff z dx dy dz throughout the region 
Hence evaluate the triple integral defined by x? +y? < z, x? +y? +z < 1,2 > 0. 
102 Using spherical polar coordinates, evaluate 


||| ехр[-(х + у + z)']dx dy dz 


V 


$\iiint x\,dx\,dy\,dz$ throughout the positive octant of
the sphere $x^2 + y^2 + z^2 = a^2$.


3.4.11 Gauss's divergence theorem


In the same way that Green's theorem relates surface and line integrals, Gauss's theorem
relates surface and volume integrals.

Consider the closed volume V with surface area S shown in Figure 3.41. The surface
integral $\iint_S \mathbf{F}\cdot d\mathbf{S}$ may be interpreted as the flow of a liquid with velocity field F(r)
out of the volume V. In Section 3.3.1 we saw that the divergence of F could be
expressed as

\[ \operatorname{div}\mathbf{F} = \nabla\cdot\mathbf{F} = \lim_{\Delta V\to 0} \frac{\text{flow out of }\Delta V}{\Delta V} \]

Figure 3.41 Closed volume with surface S.

In terms of differentials, this may be written

\[ \operatorname{div}\mathbf{F}\,dV = \text{flow out of } dV \]

Consider now a partition of the volume V given by $\Delta V_i$ (i = 1, ..., n). Then the total
flow out of V is the sum of the flows out of each $\Delta V_i$. That is,

\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \lim_{n\to\infty} \sum_{i=1}^{n} (\text{flow out of }\Delta V_i) = \lim_{n\to\infty} \sum_{i=1}^{n} (\operatorname{div}\mathbf{F}\,\Delta V_i) \]

giving

\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_V \operatorname{div}\mathbf{F}\,dV \qquad (3.36) \]

This result is known as the divergence theorem or Gauss's theorem. It enables us
to convert surface integrals into volume integrals, and often simplifies their evaluation.

Example 3.30  A vector field F(r) is given by

\[ \mathbf{F}(\mathbf{r}) = x^3y\,\mathbf{i} + x^2y^2\,\mathbf{j} + x^2yz\,\mathbf{k} \]

Find $\iint_S \mathbf{F}\cdot d\mathbf{S}$, where S is the surface of the region in the first octant for which
$x + y + z \le 1$.

Figure 3.42 Region V and surface S for Example 3.30.

Solution  We begin by sketching the region V enclosed by S, as shown in Figure 3.42. It is clear that
evaluating the surface integral directly will be rather clumsy, involving four separate
integrals (one over each of the four surfaces). It is simpler in this case to transform it into
a volume integral using the divergence theorem (3.36):

\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_V \operatorname{div}\mathbf{F}\,dV \]

Here

\[ \operatorname{div}\mathbf{F} = 3x^2y + 2x^2y + x^2y = 6x^2y \]

and we obtain

\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \int_0^1 dx \int_0^{1-x} dy \int_0^{1-x-y} 6x^2y\,dz \]

\[ = 6\int_0^1 x^2\,dx \int_0^{1-x} y\,dy \int_0^{1-x-y} dz \quad \text{(see Example 3.28)} \]

\[ = 6\int_0^1 x^2\,dx \int_0^{1-x} [(1 - x)y - y^2]\,dy \]

\[ = \int_0^1 x^2(1 - x)^3\,dx = \tfrac{1}{60} \]

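Example 3.31 Verify the divergence theorem

\[
\iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_V \operatorname{div}\mathbf{F}\,dV
\]

when $\mathbf{F} = 2xz\,\mathbf{i} + yz\,\mathbf{j} + z^2\,\mathbf{k}$ and V is the volume enclosed by the upper hemisphere $x^2 + y^2 + z^2 = a^2$, $z \ge 0$.

Figure 3.43 Hemisphere for Example 3.31.

Solution The volume V and surface S of the hemisphere are illustrated in Figure 3.43. Note that since the theorem relates to a closed volume, the surface S consists of the flat circular base in the (x, y) plane as well as the hemispherical surface. In this case

\[
\operatorname{div}\mathbf{F} = 2z + z + 2z = 5z
\]

so that the volume integral is readily evaluated as

\[
\iiint_V 5z\,dx\,dy\,dz = \int_0^a 5z\,\pi r^2\,dz = \int_0^a 5\pi z(a^2 - z^2)\,dz = \tfrac{5}{4}\pi a^4
\]

where $r^2 = a^2 - z^2$ is the squared radius of the circular cross-section at height z.

Considering the surface integral

\[
\iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_{\text{circular base}} \mathbf{F}\cdot\hat{\mathbf{n}}_1\,dS + \iint_{\text{hemisphere}} \mathbf{F}\cdot\hat{\mathbf{n}}_2\,dS
\]

The unit normal to the base is clearly $\hat{\mathbf{n}}_1 = -\mathbf{k}$, so

\[
\mathbf{F}\cdot\hat{\mathbf{n}}_1 = -z^2
\]

giving

\[
\iint_{\text{circular base}} \mathbf{F}\cdot\hat{\mathbf{n}}_1\,dS = 0
\]

since z = 0 on this surface.

The hemispherical surface is given by

\[
f(x, y, z) = x^2 + y^2 + z^2 - a^2 = 0
\]

so the outward unit normal $\hat{\mathbf{n}}_2$ is

\[
\hat{\mathbf{n}}_2 = \frac{\nabla f}{|\nabla f|} = \frac{2x\,\mathbf{i} + 2y\,\mathbf{j} + 2z\,\mathbf{k}}{2\sqrt{x^2 + y^2 + z^2}}
\]

Since $x^2 + y^2 + z^2 = a^2$ on the surface,

\[
\hat{\mathbf{n}}_2 = \frac{x}{a}\,\mathbf{i} + \frac{y}{a}\,\mathbf{j} + \frac{z}{a}\,\mathbf{k}
\]

giving

\[
\mathbf{F}\cdot\hat{\mathbf{n}}_2 = \frac{2x^2 z}{a} + \frac{y^2 z}{a} + \frac{z^3}{a} = \frac{x^2 z}{a} + \frac{z}{a}(x^2 + y^2 + z^2)
\]

Hence

\[
\iint_{\text{hemisphere}} \mathbf{F}\cdot\hat{\mathbf{n}}_2\,dS = \iint_{\text{hemisphere}} \frac{z}{a}(x^2 + a^2)\,dS
\]

since $x^2 + y^2 + z^2 = a^2$ on the surface. Transforming to spherical polar coordinates,

\[
x = a\sin\theta\cos\phi, \quad z = a\cos\theta, \quad dS = a^2\sin\theta\,d\theta\,d\phi
\]

the surface integral becomes

\[
\iint_{\text{hemisphere}} \mathbf{F}\cdot\hat{\mathbf{n}}_2\,dS = a^4\int_0^{2\pi}\!\!\int_0^{\pi/2} (\sin\theta\cos\theta + \sin^3\theta\cos\theta\cos^2\phi)\,d\theta\,d\phi
\]

\[
= a^4\int_0^{2\pi} \left[\tfrac{1}{2}\sin^2\theta + \tfrac{1}{4}\sin^4\theta\cos^2\phi\right]_0^{\pi/2} d\phi
\]

\[
= a^4\int_0^{2\pi} \left[\tfrac{1}{2} + \tfrac{1}{4}\cos^2\phi\right] d\phi = \tfrac{5}{4}\pi a^4
\]

thus confirming that

\[
\iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_V \operatorname{div}\mathbf{F}\,dV
\]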
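As a cross-check on Example 3.31, the flux through the curved surface can be estimated numerically and compared with the value $\tfrac{5}{4}\pi a^4$ of the volume integral. This is a sketch only: the midpoint rule, the grid size `n` and the choice a = 2 are assumptions made for illustration.

```python
import math


def F(x, y, z):
    # The field of Example 3.31: F = 2xz i + yz j + z^2 k
    return (2.0 * x * z, y * z, z * z)


def hemisphere_flux(a, n=200):
    """Midpoint-rule estimate of the flux of F out of the closed hemisphere
    x^2 + y^2 + z^2 = a^2, z >= 0."""
    flux = 0.0
    d_theta = (math.pi / 2.0) / n
    d_phi = (2.0 * math.pi) / n
    for i in range(n):
        theta = (i + 0.5) * d_theta
        for j in range(n):
            phi = (j + 0.5) * d_phi
            # On the curved surface the outward unit normal is r / a.
            nx = math.sin(theta) * math.cos(phi)
            ny = math.sin(theta) * math.sin(phi)
            nz = math.cos(theta)
            fx, fy, fz = F(a * nx, a * ny, a * nz)
            dS = a * a * math.sin(theta) * d_theta * d_phi
            flux += (fx * nx + fy * ny + fz * nz) * dS
    # On the flat base z = 0 and the normal is -k, so F . n = -z^2 = 0:
    # the base contributes nothing to the total.
    return flux


a = 2.0
surface_side = hemisphere_flux(a)
volume_side = 5.0 * math.pi * a**4 / 4.0
print(surface_side, volume_side)
```

Evaluating the field directly in Cartesian components, rather than using the simplified integrand, gives an independent check on the algebra for $\mathbf{F}\cdot\hat{\mathbf{n}}_2$.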


3.4.12 Stokes' theorem

Stokes' theorem is the generalization of Green's theorem, and relates line integrals in three dimensions with surface integrals. At the end of Section 3.3.3 we saw that the curl of the vector F could be expressed in the form

\[
\operatorname{curl}\mathbf{F}\cdot\hat{\mathbf{n}} = \lim_{\Delta S\to 0}\frac{\text{flow round }\Delta S}{\Delta S}
\]

In terms of differentials, this becomes

\[
\operatorname{curl}\mathbf{F}\cdot d\mathbf{S} = \text{flow round } dS
\]

Consider the surface S shown in Figure 3.44, bounded by the curve C. Then the line integral $\oint_C \mathbf{F}\cdot d\mathbf{r}$ can be interpreted as the total flow of a fluid with velocity field F around the curve C. Partitioning the surface S into elements $\Delta S_i$ (i = 1, ..., n), we can write

\[
\oint_C \mathbf{F}\cdot d\mathbf{r} = \lim_{n\to\infty}\sum_{i=1}^{n}(\text{flow round }\Delta S_i) = \lim_{n\to\infty}\sum_{i=1}^{n}\operatorname{curl}\mathbf{F}\cdot\Delta\mathbf{S}_i
\]

so that

\[
\oint_C \mathbf{F}\cdot d\mathbf{r} = \iint_S (\operatorname{curl}\mathbf{F})\cdot d\mathbf{S} \tag{3.37}
\]

Figure 3.44 Surface S bounded by curve C.

Figure 3.45 Two paths, $C_1$ and $C_2$, joining points A and B.


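This result is known as Stokes' theorem. It provides a condition for a line integral to be independent of its path of integration. For, if the integral $\int_A^B \mathbf{F}\cdot d\mathbf{r}$ is independent of the path of integration then

\[
\int_{C_1} \mathbf{F}\cdot d\mathbf{r} = \int_{C_2} \mathbf{F}\cdot d\mathbf{r}
\]

where $C_1$ and $C_2$ are two different paths joining A and B as shown in Figure 3.45. Since

\[
\int_{-C_2} \mathbf{F}\cdot d\mathbf{r} = -\int_{C_2} \mathbf{F}\cdot d\mathbf{r}
\]

where $-C_2$ is the path $C_2$ traversed in the opposite direction, we have

\[
\int_{C_1} \mathbf{F}\cdot d\mathbf{r} + \int_{-C_2} \mathbf{F}\cdot d\mathbf{r} = 0
\]

That is,

\[
\oint_C \mathbf{F}\cdot d\mathbf{r} = 0
\]

where C is the combined, closed curve formed from $C_1$ and $-C_2$. Stokes' theorem implies that if $\oint_C \mathbf{F}\cdot d\mathbf{r} = 0$ then

\[
\iint_S (\operatorname{curl}\mathbf{F})\cdot d\mathbf{S} = 0
\]

for any surface S bounded by C. Since this is true for all surfaces bounded by C, we deduce that the integrand must be zero, that is curl F = 0. Writing $\mathbf{F} = (F_1, F_2, F_3)$, we then have that

\[
\mathbf{F}\cdot d\mathbf{r} = F_1\,dx + F_2\,dy + F_3\,dz
\]

is an exact differential if curl F = 0; that is, if

\[
\frac{\partial F_1}{\partial z} = \frac{\partial F_3}{\partial x}, \quad \frac{\partial F_1}{\partial y} = \frac{\partial F_2}{\partial x}, \quad \frac{\partial F_2}{\partial z} = \frac{\partial F_3}{\partial y}
\]

Thus there is a function $f(x, y, z) = f(\mathbf{r})$ such that

\[
F_1 = \frac{\partial f}{\partial x}, \quad F_2 = \frac{\partial f}{\partial y}, \quad F_3 = \frac{\partial f}{\partial z}
\]

that is, such that $\mathbf{F}(\mathbf{r}) = \operatorname{grad} f$.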
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When F(r) represents a field of force, the field is said to be conservative (since it 
conserves rather than dissipates energy). When F(r) represents a velocity field for a 
fluid, the field is said to be curl-free or irrotational. 


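Example 3.32 Verify Stokes' theorem for $\mathbf{F} = (2x - y)\mathbf{i} - yz^2\mathbf{j} - y^2 z\mathbf{k}$, where S is the upper half of the sphere $x^2 + y^2 + z^2 = 1$ and C is its boundary.

Figure 3.46 Hemispherical surface and boundary for Example 3.32.

Solution The surface and boundary involved are illustrated in Figure 3.46. We are required to show that

\[
\oint_C \mathbf{F}\cdot d\mathbf{r} = \iint_S \operatorname{curl}\mathbf{F}\cdot d\mathbf{S}
\]

Since C is a circle of unit radius in the (x, y) plane, to evaluate $\oint_C \mathbf{F}\cdot d\mathbf{r}$, we take

\[
x = \cos\phi, \quad y = \sin\phi
\]

so that

\[
\mathbf{r} = \cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}
\]

giving

\[
d\mathbf{r} = -\sin\phi\,d\phi\,\mathbf{i} + \cos\phi\,d\phi\,\mathbf{j}
\]

Also, on the boundary C, z = 0, so that

\[
\mathbf{F} = (2x - y)\mathbf{i} = (2\cos\phi - \sin\phi)\mathbf{i}
\]

Thus

\[
\oint_C \mathbf{F}\cdot d\mathbf{r} = \int_0^{2\pi} (2\cos\phi - \sin\phi)\mathbf{i}\cdot(-\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j})\,d\phi
\]

\[
= \int_0^{2\pi} (-2\sin\phi\cos\phi + \sin^2\phi)\,d\phi = \int_0^{2\pi} \left[-\sin 2\phi + \tfrac{1}{2}(1 - \cos 2\phi)\right] d\phi = \pi
\]

Also

\[
\operatorname{curl}\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ 2x - y & -yz^2 & -y^2 z \end{vmatrix} = \mathbf{k}
\]

The unit outward-drawn normal at a point (x, y, z) on the hemisphere is given by $(x\mathbf{i} + y\mathbf{j} + z\mathbf{k})$, since $x^2 + y^2 + z^2 = 1$. Thus

\[
\iint_S \operatorname{curl}\mathbf{F}\cdot d\mathbf{S} = \iint_S \mathbf{k}\cdot(x\mathbf{i} + y\mathbf{j} + z\mathbf{k})\,dS = \iint_S z\,dS
\]

\[
= \int_0^{2\pi}\!\!\int_0^{\pi/2} \cos\theta\sin\theta\,d\theta\,d\phi = 2\pi\left[\tfrac{1}{2}\sin^2\theta\right]_0^{\pi/2} = \pi
\]

Hence $\oint_C \mathbf{F}\cdot d\mathbf{r} = \iint_S (\operatorname{curl}\mathbf{F})\cdot d\mathbf{S}$, and Stokes' theorem is verified.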
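Both sides of Stokes' theorem in Example 3.32 can also be estimated numerically. The sketch below assumes a midpoint rule and uses illustrative function names (`circulation`, `curl_flux`); it relies on the result curl F = k obtained above rather than recomputing the curl.

```python
import math


def F(x, y, z):
    # The field of Example 3.32: F = (2x - y) i - y z^2 j - y^2 z k
    return (2.0 * x - y, -y * z * z, -y * y * z)


def circulation(n=2000):
    """Midpoint-rule estimate of the line integral of F around the
    unit circle x = cos(phi), y = sin(phi), z = 0."""
    total = 0.0
    d_phi = 2.0 * math.pi / n
    for i in range(n):
        phi = (i + 0.5) * d_phi
        fx, fy, _ = F(math.cos(phi), math.sin(phi), 0.0)
        # dr = (-sin(phi), cos(phi), 0) d_phi
        total += (-fx * math.sin(phi) + fy * math.cos(phi)) * d_phi
    return total


def curl_flux(n=400):
    """Flux of curl F = k through the upper unit hemisphere, which is the
    integral of z dS with z = cos(theta), dS = sin(theta) d_theta d_phi."""
    total = 0.0
    d_theta = (math.pi / 2.0) / n
    for i in range(n):
        theta = (i + 0.5) * d_theta
        # The integrand does not depend on phi, so that integral gives 2*pi.
        total += math.cos(theta) * math.sin(theta) * d_theta * 2.0 * math.pi
    return total


print(circulation(), curl_flux(), math.pi)
```

All three printed values should be close to π.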


3.4.13 Exercises

103 Evaluate $\iint_S \mathbf{F}\cdot d\mathbf{S}$, where $\mathbf{F} = (4xz, -y^2, yz)$ and S is the surface of the cube bounded by the planes x = 0, x = 1, y = 0, y = 1, z = 0 and z = 1.

104 Use the divergence theorem to evaluate the surface integral $\iint_S \mathbf{F}\cdot d\mathbf{S}$, where $\mathbf{F} = xz\,\mathbf{i} + yz\,\mathbf{j} + z^2\,\mathbf{k}$ and S is the closed surface of the hemisphere $x^2 + y^2 + z^2 = 4$, $z \ge 0$. (Note that you are not required to verify the theorem.)

105 Verify the divergence theorem

\[
\iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_V \operatorname{div}\mathbf{F}\,dV
\]

for $\mathbf{F} = 4x\,\mathbf{i} - 2y^2\,\mathbf{j} + z^2\,\mathbf{k}$ over the region bounded by $x^2 + y^2 = 4$, z = 0 and z = 3.

106 Prove that

\[
\iiint_V (\operatorname{grad}\phi)\cdot(\operatorname{curl}\mathbf{F})\,dV = \iint_S (\mathbf{F}\times\operatorname{grad}\phi)\cdot d\mathbf{S}
\]

107 Verify the divergence theorem for $\mathbf{F} = (xy + y^2)\mathbf{i} + x^2 y\,\mathbf{j}$ and the volume in the first octant bounded by x = 0, y = 0, z = 0, z = 1 and $x^2 + y^2 = 4$.

108 Use Stokes' theorem to show that the value of the line integral $\int_A^B \mathbf{F}\cdot d\mathbf{r}$ for

\[
\mathbf{F} = (36xz + 6y\cos x,\; 3 + 6\sin x + z\sin y,\; 18x^2 - \cos y)
\]

is independent of the path joining the points A and B.

109 Use Stokes' theorem to evaluate the line integral $\oint_C \mathbf{A}\cdot d\mathbf{r}$, where $\mathbf{A} = -y\,\mathbf{i} + x\,\mathbf{j}$ and C is the boundary of the ellipse $x^2/a^2 + y^2/b^2 = 1$, z = 0.

110 Verify Stokes' theorem by evaluating both sides of

\[
\iint_S (\operatorname{curl}\mathbf{F})\cdot d\mathbf{S} = \oint_C \mathbf{F}\cdot d\mathbf{r}
\]

where $\mathbf{F} = (2x - y)\mathbf{i} - yz^2\mathbf{j} - y^2 z\mathbf{k}$ and S is the curved surface of the hemisphere $x^2 + y^2 + z^2 = 16$, $z \ge 0$.

111 By applying Stokes' theorem to the function $\mathbf{a}f(\mathbf{r})$, where $\mathbf{a}$ is a constant, deduce that

\[
\iint_S (\hat{\mathbf{n}}\times\operatorname{grad} f)\,dS = \oint_C f\,d\mathbf{r}
\]

Verify this result for the function $f(\mathbf{r}) = 3xy^2$ and the rectangle in the plane z = 0 bounded by the lines x = 0, x = 1, y = 0 and y = 2.

112 Verify Stokes' theorem for $\mathbf{F} = (2y + z, x - z, y - x)$ for the part of $x^2 + y^2 + z^2 = 1$ lying in the positive octant.

3.5 Engineering application: Streamlines in fluid dynamics


As we mentioned in Section 3.1.5, differentials often occur in mathematical modelling 
of practical problems. An example occurs in fluid dynamics. Consider the case of 
steady-state incompressible fluid flow in two dimensions. Using rectangular cartesian 
coordinates (x, y) to describe a point in the fluid, let u and v be the velocities of the fluid 
in the x and y directions respectively. Then by considering the flow in and flow out of 
a small rectangle, as shown in Figure 3.47, per unit time, we obtain a differential 
relationship between u(x, y) and v(x, y) that models the fact that no fluid is lost or gained 
in the rectangle; that is, the fluid is conserved. 

The velocity of the fluid q is a vector point function. The values of its components u and v depend on the spatial coordinates x and y. The flow into the small rectangle in unit time is

\[
u(x, \tilde{y})\,\Delta y + v(\tilde{x}, y)\,\Delta x
\]

where $\tilde{x}$ lies between x and $x + \Delta x$, and $\tilde{y}$ lies between y and $y + \Delta y$. Similarly, the flow out of the rectangle is

\[
u(x + \Delta x, \bar{y})\,\Delta y + v(\bar{x}, y + \Delta y)\,\Delta x
\]

where $\bar{x}$ lies between x and $x + \Delta x$ and $\bar{y}$ lies between y and $y + \Delta y$. Because no fluid is created or destroyed within the rectangle, we may equate these two expressions, giving

\[
u(x, \tilde{y})\,\Delta y + v(\tilde{x}, y)\,\Delta x = u(x + \Delta x, \bar{y})\,\Delta y + v(\bar{x}, y + \Delta y)\,\Delta x
\]

Rearranging, we have

\[
\frac{u(x + \Delta x, \bar{y}) - u(x, \tilde{y})}{\Delta x} + \frac{v(\bar{x}, y + \Delta y) - v(\tilde{x}, y)}{\Delta y} = 0
\]

Letting $\Delta x \to 0$ and $\Delta y \to 0$ gives the continuity equation

\[
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0
\]

Figure 3.47 Fluid flow.

Figure 3.48 Streamline.

The fluid actually flows along paths called streamlines so that there is no flow across a streamline. Thus from Figure 3.48 we deduce that

\[
v\,\Delta x = u\,\Delta y
\]

and hence

\[
v\,dx - u\,dy = 0
\]

The condition for this expression to be an exact differential is

\[
\frac{\partial}{\partial y}(v) = \frac{\partial}{\partial x}(-u)
\]

or

\[
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0
\]

This is satisfied for incompressible flow since it is just the continuity equation, so that we deduce that there is a function $\psi(x, y)$, called the stream function, such that


\[
u = \frac{\partial\psi}{\partial y} \quad\text{and}\quad v = -\frac{\partial\psi}{\partial x}
\]

It follows that if we are given u and v, as functions of x and y, that satisfy the continuity equation then we can find the equations of the streamlines given by $\psi(x, y)$ = constant.

Example 3.33 Find the stream function $\psi(x, y)$ for the incompressible flow that is such that the velocity q at the point (x, y) is

\[
\left(-\frac{y}{x^2 + y^2},\; \frac{x}{x^2 + y^2}\right)
\]

Solution From the definition of the stream function, we have

\[
u(x, y) = \frac{\partial\psi}{\partial y} \quad\text{and}\quad v(x, y) = -\frac{\partial\psi}{\partial x}
\]

provided that

\[
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0
\]

Here we have

\[
u = \frac{-y}{x^2 + y^2} \quad\text{and}\quad v = \frac{x}{x^2 + y^2}
\]

so that

\[
\frac{\partial v}{\partial y} = -\frac{2yx}{(x^2 + y^2)^2}, \qquad \frac{\partial u}{\partial x} = \frac{2xy}{(x^2 + y^2)^2}
\]

confirming that

\[
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0
\]

Integrating

\[
\frac{\partial\psi}{\partial y} = u = \frac{-y}{x^2 + y^2}
\]

with respect to y, keeping x constant, gives

\[
\psi(x, y) = -\tfrac{1}{2}\ln(x^2 + y^2) + g(x)
\]

Differentiating partially with respect to x gives

\[
\frac{\partial\psi}{\partial x} = -\frac{x}{x^2 + y^2} + \frac{dg}{dx}
\]

Since it is known that

\[
\frac{\partial\psi}{\partial x} = -v(x, y) = -\frac{x}{x^2 + y^2}
\]

we have

\[
\frac{dg}{dx} = 0
\]

which on integrating gives

\[
g(x) = C
\]

where C is a constant. Substituting back into the expression obtained for $\psi(x, y)$, we have

\[
\psi(x, y) = -\tfrac{1}{2}\ln(x^2 + y^2) + C
\]

A streamline of the flow is given by the equation $\psi(x, y) = k$, where k is a constant. After a little manipulation this gives

\[
x^2 + y^2 = a^2 \quad\text{and}\quad \ln a = C - k
\]

and the corresponding streamlines are shown in Figure 3.49. This is an example of a vortex.

Figure 3.49 Streamline illustrating a vortex.
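The stream function found in Example 3.33 can be checked by differentiating it numerically and comparing with the given velocity components. The sketch below assumes the convention $u = \partial\psi/\partial y$, $v = -\partial\psi/\partial x$ and uses central differences; the step size `h` and the test point are arbitrary choices made for illustration.

```python
import math


def psi(x, y, C=0.0):
    # Stream function psi = -(1/2) ln(x^2 + y^2) + C
    return -0.5 * math.log(x * x + y * y) + C


def velocity(x, y):
    # Given velocity field q = (-y / (x^2 + y^2), x / (x^2 + y^2))
    r2 = x * x + y * y
    return (-y / r2, x / r2)


def velocity_from_psi(x, y, h=1e-5):
    """Recover (u, v) from psi using u = d(psi)/dy and v = -d(psi)/dx,
    with both derivatives approximated by central differences."""
    u = (psi(x, y + h) - psi(x, y - h)) / (2.0 * h)
    v = -(psi(x + h, y) - psi(x - h, y)) / (2.0 * h)
    return (u, v)


point = (1.2, -0.7)
print(velocity(*point))
print(velocity_from_psi(*point))

# psi is constant on a circle centred on the origin, so circles are streamlines.
values = [psi(2.0 * math.cos(t), 2.0 * math.sin(t)) for t in (0.0, 1.0, 2.5, 4.0)]
print(values)
```

The two velocity pairs should agree closely, and the list of ψ values on the circle of radius 2 should be constant to rounding error.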








3.6 Engineering application: Heat transfer

In modelling heat transfer problems we make use of three experimental laws.

(1) Heat flows from hot regions to cold regions of a body.

(2) The rate at which heat flows through a plane section drawn in a body is proportional to its area and to the temperature gradient normal to the section.

(3) The quantity of heat in a body is proportional to its mass and to its temperature.

In the simplest case we consider heat transfer in a medium for which the constants of proportionality in the above laws are independent of direction. Such a medium is called thermally isotropic. For any arbitrary region within such a medium we can obtain an equation that models such heat flows. The total amount Q(t) of heat within the region V is

\[
Q(t) = \iiint_V c\rho\,u(\mathbf{r}, t)\,dV
\]

where c is the specific heat of the medium, ρ is the density and u(r, t) is the temperature at the point r at time t. Heat flows out of the region through its bounding surface S. The experimental laws (1) and (2) above imply that the rate at which heat flows across an element ΔS of that surface is $-k\nabla u\cdot\Delta\mathbf{S}$, where k is the thermal conductivity of the medium. (The minus sign indicates that heat flows from hot regions to cold.) Thus the rate at which heat flows across the whole surface of the region is given by

\[
\iint_S (-k\nabla u)\cdot d\mathbf{S} = -k\iint_S \nabla u\cdot d\mathbf{S}
\]

Using Gauss's theorem, we deduce that the rate at which heat flows out of the region is

\[
-k\iiint_V \nabla^2 u\,dV
\]

If there are no sources or sinks of heat within the region, this must equal the rate at which the region loses heat, $-dQ/dt$. Therefore

\[
-\frac{d}{dt}\left[\iiint_V c\rho\,u(\mathbf{r}, t)\,dV\right] = -k\iiint_V \nabla^2 u\,dV
\]

Since

\[
\frac{d}{dt}\iiint_V u\,dV = \iiint_V \frac{\partial u}{\partial t}\,dV
\]

this implies that

\[
\iiint_V \left(k\nabla^2 u - c\rho\,\frac{\partial u}{\partial t}\right) dV = 0
\]

This models the situation for any arbitrarily chosen region V. The arbitrariness in the choice of V implies that the value of the integral is independent of V and that the integrand is equal to zero. Thus

\[
\nabla^2 u = \frac{c\rho}{k}\,\frac{\partial u}{\partial t}
\]

The quantity k/cρ is termed the thermal diffusivity of the medium and is usually denoted by the Greek letter kappa, κ. The differential equation models heat flow within a medium. Its solution depends on the initial temperature distribution u(r, 0) and on the conditions pertaining at the boundary of the region. Methods for solving this equation are discussed in Chapter 9. This differential equation also occurs as a model for water percolation through a dam, for neutron transport in reactors and in charge transfer within charge-coupled devices. We shall now proceed to obtain its solution in a very special case.
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Example 3.34 A large slab of material has an initial temperature distribution such that one half is at 
$-u_0$ and the other at $+u_0$. Obtain a mathematical model for this situation and solve it, stating explicitly the assumptions that are made.


Solution When a problem is stated in such vague terms, it is difficult to know what approximations and simplifications may be reasonably made. Since we are dealing with heat transfer, we know that for an isotropic medium the temperature distribution satisfies the equation

\[
\nabla^2 u = \frac{1}{\kappa}\,\frac{\partial u}{\partial t}
\]

throughout the medium. We know that the region we are studying is divided so that at t = 0 the temperature in one part is $-u_0$ while that in the other is $+u_0$, as illustrated in Figure 3.50. We can deduce from this figure that the subsequent temperature at a point in the medium depends only on the perpendicular distance of the point from the dividing plane. We choose a coordinate system so that its origin lies on the dividing plane and the x axis is perpendicular to it, as shown in Figure 3.51. Then the differential equation simplifies, since u(r, t) is independent of y and z, and we have

\[
\frac{\partial^2 u}{\partial x^2} = \frac{1}{\kappa}\,\frac{\partial u}{\partial t}, \quad\text{with}\quad u(x, 0) = \begin{cases} -u_0 & (x < 0) \\ +u_0 & (x \ge 0) \end{cases}
\]

Figure 3.50 Region for Example 3.34. Figure 3.51 Coordinate system for Example 3.34.

Thinking about the physical problem also provides us with some further information. The heat flows from the hot region to the cold until (eventually) the temperature is uniform throughout the medium. In this case that terminal temperature is zero, since initially half the medium is at temperature $+u_0$ and the other half at $-u_0$. So we know that $u(x, t) \to 0$ as $t \to \infty$. We also deduce from the initial temperature distribution that $-u_0 \le u(x, t) \le u_0$ for all x and t, since there are no extra sources or sinks of heat in the medium. Summarizing, we have

\[
\frac{\partial^2 u}{\partial x^2} = \frac{1}{\kappa}\,\frac{\partial u}{\partial t} \quad (-\infty < x < \infty,\; t \ge 0)
\]

with

\[
u(x, 0) = \begin{cases} -u_0 & (x < 0) \\ +u_0 & (x \ge 0) \end{cases}
\]

u(x, t) bounded for all x

$u(x, t) \to 0$ as $t \to \infty$

There are many approaches to solving this problem (see Chapter 9). One is to investigate the effect of changing the scale of the independent variables x and t. Setting $x = \lambda X$ and $t = \mu T$, where λ and μ are positive constants, the problem becomes

\[
\frac{\partial^2 U}{\partial X^2} = \frac{\lambda^2}{\kappa\mu}\,\frac{\partial U}{\partial T}
\]

with U(X, T) = u(x, t) and $U(X, 0) = u_0\,\operatorname{sgn} X$. Choosing $\mu = \lambda^2$, we see that

\[
\frac{\partial^2 U}{\partial X^2} = \frac{1}{\kappa}\,\frac{\partial U}{\partial T}, \quad\text{with}\quad U(X, 0) = u_0\,\operatorname{sgn} X
\]

which implies that the solution u(x, t) of the original equation is also a solution of the scaled equation. Thus

\[
u(x, t) = u(\lambda x, \lambda^2 t)
\]

which suggests that we should look for a solution expressed in terms of a new variable s that is proportional to the ratio of x to $\sqrt{t}$. Setting $s = ax/\sqrt{t}$, we seek a solution as a function of s:

\[
u(x, t) = u_0 f(s)
\]

This reduces the partial differential equation for u to an ordinary differential equation for f, since

\[
\frac{\partial u}{\partial x} = \frac{a u_0}{\sqrt{t}}\,\frac{df}{ds}, \qquad \frac{\partial^2 u}{\partial x^2} = \frac{a^2 u_0}{t}\,\frac{d^2 f}{ds^2}, \qquad \frac{\partial u}{\partial t} = -\frac{1}{2}\,\frac{a x u_0}{t\sqrt{t}}\,\frac{df}{ds}
\]

Thus the differential equation is transformed into

\[
\frac{a^2}{t}\,\frac{d^2 f}{ds^2} = -\frac{a x}{2\kappa\,t\sqrt{t}}\,\frac{df}{ds}
\]

giving

\[
a^2\,\frac{d^2 f}{ds^2} = -\frac{s}{2\kappa}\,\frac{df}{ds}
\]

Choosing the constant a such that $a^2 = 1/(4\kappa)$ reduces this to the equation

\[
\frac{d^2 f}{ds^2} = -2s\,\frac{df}{ds}
\]

The initial condition is transformed into two conditions, since for x < 0, $s \to -\infty$ as $t \to 0$ and for x > 0, $s \to +\infty$ as $t \to 0$. So we have

\[
f(s) \to 1 \quad\text{as}\quad s \to \infty
\]

\[
f(s) \to -1 \quad\text{as}\quad s \to -\infty
\]

Integrating the differential equation once gives

\[
\frac{df}{ds} = A\,e^{-s^2}, \quad\text{where } A \text{ is a constant}
\]

and integrating a second time gives

\[
f(s) = A\int e^{-s^2}\,ds + B
\]

The integral occurring here is one that frequently arises in heat transfer problems, and is given a special name. We define the error function, erf(x), by the integral

\[
\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-z^2}\,dz
\]

Its name derives from the fact that it is associated with the normal distribution, which is a common model for the distribution of experimental errors (see Section 11.2.4). This is a well-tabulated function, and has the property that $\operatorname{erf}(x) \to 1$ as $x \to \infty$. Writing the solution obtained above in terms of the error function (and absorbing the factor $\tfrac{1}{2}\sqrt{\pi}$ into the arbitrary constant A), we have

\[
f(s) = A\,\operatorname{erf}(s) + B
\]

Letting $s \to \infty$ and $s \to -\infty$ gives two equations for A and B:

\[
1 = A + B
\]

\[
-1 = -A + B
\]

from which we deduce A = 1 and B = 0. Thus

\[
f(s) = \operatorname{erf}(s)
\]

so that

\[
u(x, t) = u_0\,\operatorname{erf}\!\left(\frac{x}{2\sqrt{\kappa t}}\right) = \frac{2u_0}{\sqrt{\pi}}\int_0^{x/2\sqrt{\kappa t}} e^{-z^2}\,dz
\]
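The error-function solution of Example 3.34 can be tested numerically against the one-dimensional heat equation. The sketch below uses `math.erf` from the Python standard library and central finite differences; the values $u_0 = 1$, κ = 0.5 and the step size `h` are assumptions chosen only for illustration.

```python
import math


def u(x, t, u0=1.0, kappa=0.5):
    # Similarity solution u = u0 * erf(x / (2 sqrt(kappa t))), valid for t > 0
    return u0 * math.erf(x / (2.0 * math.sqrt(kappa * t)))


def residual(x, t, kappa=0.5, h=1e-3):
    """Finite-difference estimate of u_xx - (1/kappa) u_t, which should be
    close to zero if u satisfies the heat equation."""
    u_xx = (u(x + h, t, kappa=kappa) - 2.0 * u(x, t, kappa=kappa)
            + u(x - h, t, kappa=kappa)) / (h * h)
    u_t = (u(x, t + h, kappa=kappa) - u(x, t - h, kappa=kappa)) / (2.0 * h)
    return u_xx - u_t / kappa


print(residual(0.3, 1.0))    # close to zero
print(u(-0.5, 1e-6))         # close to -u0 just after t = 0, for x < 0
print(u(0.5, 1e-6))          # close to +u0 just after t = 0, for x > 0
print(u(1.0, 1e9))           # decays towards zero as t grows
```

The last three lines illustrate the initial condition and the long-time limit $u \to 0$ stated in the problem summary.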


3.7 Review exercises (1-21) 


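1 Show that $u(x, y) = x^n f(t)$, $t = y/x$, satisfies the differential equations

(a) $x\,\dfrac{\partial u}{\partial x} + y\,\dfrac{\partial u}{\partial y} = nu$

(b) $x^2\,\dfrac{\partial^2 u}{\partial x^2} + 2xy\,\dfrac{\partial^2 u}{\partial x\,\partial y} + y^2\,\dfrac{\partial^2 u}{\partial y^2} = n(n - 1)u$

Verify these results for the function $u(x, y) = x^4 + y^4 + 16x^2 y^2$.

2 Find the values of the numbers a and b such that the change of variables u = x + ay, v = x + by transforms the differential equation

\[
9\,\frac{\partial^2 f}{\partial x^2} - 9\,\frac{\partial^2 f}{\partial x\,\partial y} + 2\,\frac{\partial^2 f}{\partial y^2} = 0
\]

into

\[
\frac{\partial^2 f}{\partial u\,\partial v} = 0
\]

Hence deduce that the general solution of the equation is given by

\[
u(x, y) = f(x + 3y) + g(x + \tfrac{3}{2}y)
\]

where f and g are arbitrary functions. Find the solution of the differential equation that satisfies the conditions

\[
u(x, 0) = \sin x, \qquad \frac{\partial u(x, 0)}{\partial y} = 3\cos x
\]

3 A differential P(x, y, z) dx + Q(x, y, z) dy + R(x, y, z) dz is exact if there is a function f(x, y, z) such that

\[
P(x, y, z)\,dx + Q(x, y, z)\,dy + R(x, y, z)\,dz = \nabla f\cdot(dx, dy, dz)
\]

Show that this implies $\nabla\times(P, Q, R) = 0$. Deduce that curl grad f = 0.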


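4 Find grad f, plot some level curves f = constant and indicate grad f by arrows at some points on the level curves for f(r) given by

(a) $xy$ (b) $x/(x^2 + y^2)$

5 Show that if $\boldsymbol{\omega}$ is a constant vector then

(a) $\operatorname{grad}(\boldsymbol{\omega}\cdot\mathbf{r}) = \boldsymbol{\omega}$

(b) $\operatorname{curl}(\boldsymbol{\omega}\times\mathbf{r}) = 2\boldsymbol{\omega}$

6 (a) Prove that if f(r) is a scalar point function then

curl grad f = 0

(b) Prove that if $\mathbf{v} = \operatorname{grad}[zf(\mathbf{r})] + \alpha f(\mathbf{r})\mathbf{k}$ and $\nabla^2 f = 0$, where α is a constant and f is a scalar point function, then

\[
\operatorname{div}\mathbf{v} = (2 + \alpha)\frac{\partial f}{\partial z}, \qquad \nabla^2\mathbf{v} = \operatorname{grad}\left(2\,\frac{\partial f}{\partial z}\right)
\]

7 Show that if $\mathbf{F} = (x^2 - y^2 + x)\mathbf{i} - (2xy + y)\mathbf{j}$, then curl F = 0, and find f(r) such that F = grad f. Verify that

\[
\int_{(1,2)}^{(2,1)} \mathbf{F}\cdot d\mathbf{r} = \big[f(\mathbf{r})\big]_{(1,2)}^{(2,1)}
\]

8 A force F acts on a particle that is moving in two dimensions along the semicircle $x = 1 - \cos\theta$, $y = \sin\theta$ ($0 \le \theta \le \pi$). Find the work done when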


(а) Е = (х? + у?) 
(b) F= (х? + у”) 


f being the unit vector tangential to the path. 1 


9 A force F = (xy, −y, 1) acts on a particle as it moves along the straight line from (0, 0, 0) to (1, 1, 1). Calculate the work done.

10 The force F per unit length of a conducting wire carrying a current I in a magnetic field B is F = I × B. Find the force acting on a circuit whose shape is given by $x = \sin\theta$, $y = \cos\theta$, $z = \sin\tfrac{1}{2}\theta$, when current I flows in it and when it lies in a magnetic field B = xi − yj + k.

11 The velocity v at the point (x, y) in a two-dimensional fluid flow is given by $\mathbf{v} = (y\mathbf{i} - x\mathbf{j})/(x^2 + y^2)$. Find the net circulation around the square x = ±1, y = ±1.

12 A metal plate has its boundary defined by x = 0, $y = x^2/c$ and y = c. The density at the point (x, y) is kxy (per unit area). Find the moment of inertia of the plate about an axis through (0, 0) and perpendicular to the plate.

13 A right circular cone of height h and base radius a is cut into two pieces along a plane parallel to and distance c from the axis of the cone. Find the volume of the smaller piece.

14 The axes of two circular cylinders of radius a intersect at right angles. Show that the volume common to both cylinders may be expressed as the triple integral

\[
8\int_0^a dy \int_0^{\sqrt{a^2 - y^2}} dx \int_0^{\sqrt{a^2 - y^2}} dz
\]

and hence evaluate it.

15 The elastic energy of a volume V of material is $q^2 V/(2EI)$, where q is its stress and E and I are constants. Find the elastic energy of a cylindrical volume of radius r and length l in which the stress varies directly as the distance from its axis, being zero at the axis and q at the outer surface.

16 The velocity of a fluid at the point (x, y, z) has components $(3x^2 y, xy^2, 0)$. Find the flow rate out of the triangular prism bounded by z = 0, z = 1, x = 0, y = 0 and x + y = 1.

17 An electrostatic field has components $(2xy, -y^2, x + y)$ at the point (x, y, z). Find the total flux out of the sphere $x^2 + y^2 + z^2 = a^2$.

18 Verify Stokes' theorem

\[
\oint_C \mathbf{F}\cdot d\mathbf{r} = \iint_S (\operatorname{curl}\mathbf{F})\cdot d\mathbf{S}
\]

where $\mathbf{F} = (x^2 + y - 4, 3xy, 2xz + z^2)$ and S is the surface of the hemisphere $x^2 + y^2 + z^2 = 16$ above the (x, y) plane.

19 Use the divergence theorem to evaluate the surface integral

\[
\iint_S \mathbf{a}\cdot d\mathbf{S}
\]

where $\mathbf{a} = x\mathbf{i} + y\mathbf{j} - 2z\mathbf{k}$ and S is the surface of the sphere $x^2 + y^2 + z^2 = a^2$ above the (x, y) plane.

20 Evaluate the volume integral


Е 


y 
where V denotes the wedge-shaped region 
bounded in the positive octant by the four 
planes x = 0, y=0, y= 1 —x andz=2 —x. 


21 Continuing the analysis of Section 3.5, show that the net circulation of fluid around the rectangular element shown in Figure 3.47 is given by

\[
[u(x, y + \Delta y) - u(x, y)]\,\Delta x - [v(x + \Delta x, y) - v(x, y)]\,\Delta y
\]

Deduce that if the fluid motion is irrotational at (x, y), then

\[
\frac{\partial u}{\partial y} - \frac{\partial v}{\partial x} = 0
\]

Show that for irrotational incompressible flow, the stream function ψ satisfies Laplace's equation

\[
\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} = 0
\]

4 Functions of a Complex Variable

4.1 Introduction


In the theory of alternating currents, the application of quantities such as the complex 
impedance involves functions having complex numbers as independent variables. There 
are many other areas in engineering where this is the case; for example, the motion of 
fluids, the transfer of heat or the processing of signals. Some of these applications are 
discussed later in this book. 

Traditionally, complex variable techniques have been important, and extensively used, 
in a wide variety of engineering situations. This has been especially the case in areas 
such as electromagnetic and electrostatic field theory, fluid dynamics, aerodynamics 
and elasticity. With the development of computer technology and the consequential 
use of sophisticated algorithms for analysis and design in engineering there has, over 
the last two decades or so, been less emphasis on the use of complex variable tech- 
niques and a shift towards numerical techniques applied directly to the underlying full 
partial differential equations model of the situation being investigated. However, even 
when this is the case there is still considerable merit in having an analytical solution, 
possibly for an idealized model, in order both to develop better understanding of 
the behaviour of the solution and to give confidence in the numerical estimates for the 
solution of enhanced models. Many sophisticated software packages now exist, many 
of which are available as freeware, downloadable from various internet sites. The older 
packages such as FLUENT and CFX are still available and still in use by engineering 
companies to solve problems such as fluid flow and heat transfer in real situations. The 
finite element package TELEMAC is modular in style and is useful for larger-scale 
environmental problems; these types of software programs use a core plus optional 
add-ons tailored for specific applications. The best use of all such software still requires 
knowledge of mappings and use of complex variables. One should also mention the 
computer entertainment industry which makes use of such mathematics to enable 
accurate simulation of real life. The kind of mappings that used to be used extensively 
in aerodynamics are now used in the computer games industry. In particular the ability 
to analyse complicated flow patterns by mapping from a simple geometry to a complex 
one and back again remains very important. Examples at the end of the chapter illus- 
trate the techniques that have been introduced. Many engineering mathematics texts 
have introduced programming segments that help the reader to use packages such as 
MATLAB or MAPLE to carry out the technicalities. This has not been done in this 
chapter since, in the latest version of MAPLE, the user simply opens the program 
and uses the menu to click on the application required (in this chapter a derivative or 
an integral), types in the problem and presses return to get the answer. Students are 
encouraged to use such software to solve any of the problems; the understanding of 
what the solutions mean is always more important than any tricks used to solve what 
are idealized problems. 

Throughout engineering, transforms in one form or another play a major role in anal- 
ysis and design. An area of continuing importance is the use of Laplace, z, Fourier and 
other transforms in areas such as control, communication and signal processing. Such 
transforms are considered later in the book where it will be seen that complex variables 
play a key role. This chapter is devoted to developing understanding of the standard 
techniques of complex variables so as to enable the reader to apply them with confidence 
in application areas. 


Figure 4.1 Real mapping y = f(x).

Figure 4.2 Complex mapping w = f(z).

4.2 Complex functions and mappings


The concept of a function involves two sets X and Y and a rule that assigns to each element x in the set X (written $x \in X$) precisely one element $y \in Y$. Whenever this situation arises, we say that there is a function f that maps the set X to the set Y, and represent this symbolically by

\[
y = f(x) \quad (x \in X)
\]

Schematically we illustrate a function as in Figure 4.1. While x can take any value in the set X, the variable y = f(x) depends on the particular element chosen for x. We therefore refer to x as the independent variable and y as the dependent variable. The set X is called the domain of the function, and the set of all images y = f(x) ($x \in X$) is called the image set or range of f. Previously we were concerned with real functions, so that x and y were real numbers. If the independent variable is a complex variable z = x + jy, where x and y are real and $j = \sqrt{-1}$, then the function f(z) of z will in general also be complex. For example, if $f(z) = z^2$ then, replacing z by x + jy and expanding, we have

\[
f(z) = (x + jy)^2 = (x^2 - y^2) + j2xy = u + jv \quad \text{(say)}
\]

where u and v are real. Such a function f(z) is called a complex function, and we write

\[
w = f(z)
\]

where, in general, the dependent variable w = u + jv is also complex.

The reader will recall that a complex number z = x + jy can be represented on a plane called the Argand diagram, as illustrated in Figure 4.2(a). However, we cannot plot the values of x, y and f(z) on one set of axes, as we were able to do for real functions y = f(x). We therefore represent the values of

\[
w = f(z) = u + jv
\]

on a second plane as illustrated in Figure 4.2(b). The plane containing the independent variable z is called the z plane and the plane containing the dependent variable w is called the w plane. Thus the complex function w = f(z) may be regarded as a mapping or transformation of points P within a region in the z plane (called the domain) to corresponding image points P′ within a region in the w plane (called the range).

It is this facility for mapping that gives the theory of complex functions much of its application in engineering. In most useful mappings the entire z plane is mapped onto the entire w plane, except perhaps for isolated points. Throughout this chapter the domain will be taken to be the entire z plane (that is, the set of all complex numbers, denoted by $\mathbb{C}$). This is analogous, for real functions, to the domain being the entire real line (that is, the set of all real numbers $\mathbb{R}$). If this is not the case then the complex function is termed 'not well defined'. In contrast, as for real functions, the range of the complex function may well be a proper subset of $\mathbb{C}$.

Example 4.1 Find the image in the w plane of the straight line y = 2x + 4 in the z plane, z = x + jy, under the mapping

\[
w = 2z + 6
\]

Figure 4.3 The mapping of Example 4.1: (a) z plane; (b) w plane.

Solution Writing w = u + jv, where u and v are real, the mapping becomes

\[
w = u + jv = 2(x + jy) + 6
\]

or

\[
u + jv = (2x + 6) + j2y
\]

Equating real and imaginary parts then gives

\[
u = 2x + 6, \quad v = 2y \tag{4.1}
\]

which, on solving for x and y, leads to

\[
x = \tfrac{1}{2}(u - 6), \quad y = \tfrac{1}{2}v
\]

Thus the image of the straight line

\[
y = 2x + 4
\]

in the z plane is represented by

\[
\tfrac{1}{2}v = 2\times\tfrac{1}{2}(u - 6) + 4
\]

or

\[
v = 2u - 4
\]

which corresponds to a straight line in the w plane. The given line in the z plane and the mapped image line in the w plane are illustrated in Figures 4.3(a) and (b) respectively.

Note from (4.1) that, in particular, the point $P_1(-2 + j0)$ in the z plane is mapped to the point $P_1'(2 + j0)$ in the w plane, and that the point $P_2(0 + j4)$ in the z plane is mapped to the point $P_2'(6 + j8)$ in the w plane. Thus, as the point P moves from $P_1$ to $P_2$ along the line y = 2x + 4 in the z plane, the mapped point P′ moves from $P_1'$ to $P_2'$ along the line v = 2u − 4 in the w plane. It is usual to indicate this with the arrowheads as illustrated in Figure 4.3.

Figure 4.4 The degenerate mapping w = β.
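The mapping of Example 4.1 is easy to check with Python's built-in complex type, which happens to use the same symbol j for the imaginary unit. The sketch below is illustrative only; the function name `mapping` and the sample x values are arbitrary choices.

```python
def mapping(z):
    # The linear mapping w = 2z + 6 of Example 4.1
    return 2 * z + 6


# Sample points on the line y = 2x + 4 in the z plane
samples = [complex(x, 2 * x + 4) for x in (-2.0, -1.0, 0.0, 1.5)]
images = [mapping(z) for z in samples]

for z, w in zip(samples, images):
    # Each image point should satisfy v = 2u - 4 in the w plane
    print(z, "->", w, "v - (2u - 4) =", w.imag - (2 * w.real - 4))
```

The first and third sample points are $P_1$ and $P_2$ of the example, so their images should be 2 + j0 and 6 + j8 respectively.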


4.2.1 Linear mappings


The mapping w = 2z + 6 in Example 4.1 is a particular example of a mapping corresponding to the general complex linear function

\[
w = \alpha z + \beta \tag{4.2}
\]

where w and z are complex-valued variables, and α and β are complex constants. In this section we shall investigate mappings of the z plane onto the w plane corresponding to (4.2) for different choices of the constants α and β. In so doing we shall also introduce some general properties of mappings.

Case (a) α = 0

Letting α = 0 (or α = 0 + j0) in (4.2) gives

\[
w = \beta
\]

which implies that w = β, no matter what the value of z. This is quite obviously a degenerate mapping, with the entire z plane being mapped onto the one point w = β in the w plane. If nothing else, this illustrates the point made earlier in this section, that the image set may only be part of the entire w plane. In this particular case the image set is a single point. Since the whole of the z plane maps onto w = β, it follows that, in particular, z = β maps to w = β. The point β is thus a fixed point in this mapping, which is a useful concept in helping us to understand a particular mapping. A further question of interest when considering mappings is that of whether, given a point in the w plane, we can tell from which point in the z plane it came under the mapping. If it is possible to get back to a unique point in the z plane then it is said to have an inverse mapping. Clearly, for an inverse mapping z = g(w) to exist, the point in the w plane has to be in the image set of the original mapping w = f(z). Also, from the definition of a mapping, each point w in the w plane image set must lead to a single point z in the z plane under the inverse mapping z = g(w). (Note the similarity to the requirements for the existence of an inverse function $f^{-1}(x)$ of a real function f(x).) For the particular mapping w = β considered here the image set is the single point w = β in the w plane, and it is clear from Figure 4.4 that there is no way of getting back to just a single point in the z plane. Thus the mapping w = β has no inverse.


Mapping w = B 
——>» 





z plane y 


z plane w plane 
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Figure 4.5 
The mapping 
w=(1+j)z. 


Case (b) $\beta = 0$, $\alpha \neq 0$
With such a choice for the constants $\alpha$ and $\beta$, the mapping corresponding to (4.2) becomes
$$w = \alpha z$$
Under this mapping, the origin is the only fixed point, there being no other fixed points
that are finite. Also, in this case there exists an inverse mapping
$$z = \frac{1}{\alpha} w$$
that enables us to return from the w plane to the z plane to the very same point
from which we started under $w = \alpha z$. To illustrate this mapping at work, let us choose
$\alpha = 1 + j$, so that
$$w = (1 + j)z \tag{4.3}$$
and consider what happens to a general point $z_0$ in the z plane under this mapping. In
general, there are two ways of doing this. We can proceed as in Example 4.1 and split
both z and w into real and imaginary parts, equate real and imaginary parts and hence
find the image curves in the w plane to specific curves (usually the lines Re(z) = constant,
Im(z) = constant) in the z plane. Alternatively, we can rearrange the expression
for w and deduce the properties of the mapping directly. The former course of action,
as we shall see in this chapter, is the one most frequently used. Here, however, we shall
take the latter approach and write $\alpha = 1 + j$ in polar form as
$$1 + j = \sqrt{2}\, e^{j\pi/4}$$
Then, if
$$z = r e^{j\theta}$$
in polar form it follows from (4.3) that
$$w = r\sqrt{2}\, e^{j(\theta + \pi/4)} \tag{4.4}$$
We can then readily deduce from (4.4) what the mapping does. The general point in the
z plane with modulus r and argument $\theta$ is mapped onto an image point w, with modulus
$r\sqrt{2}$ and argument $\theta + \tfrac{1}{4}\pi$ in the w plane as illustrated in Figure 4.5.
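The polar-form argument can be checked numerically. The following is a minimal Python sketch, not part of the original text: the name `apply_mapping` and the test values of r and θ are illustrative assumptions. It confirms that multiplying by 1 + j scales the modulus by √2 and adds π/4 to the argument.

```python
import cmath
import math

ALPHA = 1 + 1j  # the multiplier alpha = 1 + j of equation (4.3)


def apply_mapping(z: complex) -> complex:
    """Return the image w = (1 + j) z of the point z."""
    return ALPHA * z


# An arbitrary test point z = r e^{j theta}; these values are illustrative only.
r, theta = 2.0, 0.3
z = cmath.rect(r, theta)
w = apply_mapping(z)

modulus_ratio = abs(w) / abs(z)                # to be compared with sqrt(2)
angle_shift = cmath.phase(w) - cmath.phase(z)  # to be compared with pi/4

print(modulus_ratio, math.sqrt(2))
print(angle_shift, math.pi / 4)
```

Any other non-zero test point could be used, provided the sum θ + π/4 stays within (−π, π] so that `cmath.phase` does not wrap around.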
It follows that in general the mapping
$$w = \alpha z$$
maps the origin in the z plane to the origin in the w plane (fixed point), but effects an
expansion by $|\alpha|$ and an anticlockwise rotation by $\arg \alpha$. Of course, $\arg \alpha$ need not be
positive, and indeed it could even be zero (corresponding to $\alpha$ being real and positive). The
mapping can be loosely summed up in the phrase ‘magnification and rotation, but no
translation’. Certain geometrical properties are also preserved, the most important being that
straight lines in the z plane will be transformed to straight lines in the w plane. This is
readily confirmed by noting that the equation of any straight line in the z plane can always
be written in the form
$$|z - a| = |z - b|$$
where a and b are complex constants (this being the equation of the perpendicular
bisector of the join of the two points representing a and b on the Argand diagram).
Under the mapping $w = \alpha z$, the equation maps to
$$\left|\frac{w}{\alpha} - a\right| = \left|\frac{w}{\alpha} - b\right| \quad (\alpha \neq 0)$$
or
$$|w - a\alpha| = |w - b\alpha|$$
in the w plane, which is clearly another straight line.

Figure 4.6 The mapping $w = \zeta + \beta$.
We now return to the general linear mapping (4.2) and rewrite it in the form
$$w - \beta = \alpha z$$
This can be looked upon as two successive mappings: first,
$$\zeta = \alpha z$$
identical to $w = \alpha z$ considered earlier, but this time mapping points from the z plane to
points in the $\zeta$ plane; secondly,
$$w = \zeta + \beta \tag{4.5}$$
mapping points in the $\zeta$ plane to points in the w plane. Elimination of $\zeta$ regains equation
(4.2). The mapping (4.5) represents a translation in which the origin in the $\zeta$ plane is
mapped to the point $w = \beta$ in the w plane, and the mapping of any other point in the
$\zeta$ plane is obtained by adding $\beta$ to the coordinates to obtain the equivalent point in the
w plane. Geometrically, the mapping (4.5) is as if the $\zeta$ plane is picked up and, without
rotation, the origin placed over the point $\beta$. The original axes then represent the w plane
as illustrated in Figure 4.6. Obviously all curves, in particular straight lines, are
preserved under this translation.

We are now in a position to interpret (4.2), the general linear mapping, geometrically
as a combination of mappings that can be regarded as fundamental, namely

• translation
• rotation, and
• magnification

that is, with $\theta = \arg \alpha$,
$$z \xrightarrow{\text{rotation}} e^{j\theta} z \xrightarrow{\text{magnification}} |\alpha| e^{j\theta} z \xrightarrow{\text{translation}} |\alpha| e^{j\theta} z + \beta = \alpha z + \beta = w$$
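The decomposition into rotation, magnification and translation can be confirmed numerically. The sketch below is illustrative only: the function names and the sample values of α and β are assumptions made for the check, not values taken from the text.

```python
import cmath


def linear_map(z: complex, alpha: complex, beta: complex) -> complex:
    """Apply w = alpha*z + beta in a single step."""
    return alpha * z + beta


def linear_map_in_steps(z: complex, alpha: complex, beta: complex) -> complex:
    """Apply the same mapping as rotation, then magnification, then translation."""
    theta = cmath.phase(alpha)               # rotation angle, arg(alpha)
    rotated = cmath.exp(1j * theta) * z      # rotation about the origin
    magnified = abs(alpha) * rotated         # magnification by |alpha|
    return magnified + beta                  # translation by beta


# Hypothetical constants and sample points chosen only for this check.
alpha, beta = 2 - 1j, 3 + 4j
points = [0j, 1 + 0j, 1j, -2 + 0.5j]

max_error = max(
    abs(linear_map(z, alpha, beta) - linear_map_in_steps(z, alpha, beta))
    for z in points
)
print(max_error)
```

The origin is included among the sample points because it should be carried exactly to β by the translation step.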
It clearly follows that a straight line in the z plane is mapped onto a corresponding
straight line in the w plane under the linear mapping $w = \alpha z + \beta$. A second useful
property of the linear mapping is that circles are mapped onto circles. To confirm this,
consider the general circle
$$|z - z_0| = r$$
in the z plane, having the complex number $z_0$ as its centre and the real number r as its
radius. Rearranging the mapping equation $w = \alpha z + \beta$ gives
$$z = \frac{w}{\alpha} - \frac{\beta}{\alpha} \quad (\alpha \neq 0)$$
so that
$$z - z_0 = \frac{w}{\alpha} - \frac{\beta}{\alpha} - z_0 = \frac{1}{\alpha}(w - w_0)$$
where $w_0 = \alpha z_0 + \beta$. Hence
$$|z - z_0| = r$$
implies
$$|w - w_0| = |\alpha| r$$
which is a circle, with centre $w_0$ given by the image of $z_0$ in the w plane and with radius
$|\alpha| r$ given by the radius of the z plane circle magnified by $|\alpha|$.
We conclude this section by considering examples of linear mappings.

Example 4.2

Examine the mapping
$$w = (1 + j)z + 1 - j$$
as a succession of fundamental mappings: translation, rotation and magnification.

Solution

The linear mapping can be regarded as the following sequence of simple mappings:
$$z \xrightarrow[\text{anticlockwise by } \frac{1}{4}\pi]{\text{rotation}} e^{j\pi/4} z \xrightarrow[\text{by } \sqrt{2}]{\text{magnification}} \sqrt{2}\, e^{j\pi/4} z \xrightarrow[0 \to 1 - j \text{ or } (0,0) \to (1,-1)]{\text{translation}} \sqrt{2}\, e^{j\pi/4} z + 1 - j = w$$
Figure 4.7 illustrates this process diagrammatically. The shading in Figure 4.7 helps to
identify how the z plane moves, turns and expands under this mapping. For example,
the line joining the points 0 + j2 and 1 + j0 in the z plane has the cartesian equation
$$\tfrac{1}{2} y + x = 1$$
Taking $w = u + jv$ and $z = x + jy$, the mapping
$$w = (1 + j)z + 1 - j$$
becomes
$$u + jv = (1 + j)(x + jy) + 1 - j = (x - y + 1) + j(x + y - 1)$$
Equating real and imaginary parts then gives
$$u = x - y + 1, \quad v = x + y - 1$$
which on solving for x and y gives
$$2x = u + v, \quad 2y = v - u + 2$$
Substituting for x and y into the equation $\tfrac{1}{2} y + x = 1$ then gives the image of this line in
the w plane as the line
$$3v + u = 2$$
which crosses the real axis in the w plane at 2 and the imaginary axis at $\tfrac{2}{3}$. Both lines
are shown dashed, in the z and w planes respectively, in Figure 4.7.

Figure 4.7 The mapping $w = (1 + j)z + 1 - j$.

Example 4.3

The mapping $w = \alpha z + \beta$ ($\alpha$, $\beta$ constant complex numbers) maps the point $z = 1 + j$
to the point $w = j$, and the point $z = 1 - j$ to the point $w = -1$.

(a) Determine $\alpha$ and $\beta$.

(b) Find the region in the w plane corresponding to the right half-plane Re(z) > 0
in the z plane.

(c) Find the region in the w plane corresponding to the interior of the unit circle
|z| < 1 in the z plane.

(d) Find the fixed point(s) of the mapping.

In (b)-(d) use the values of $\alpha$ and $\beta$ determined in (a).
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Solution 


(a) The two values of z and w given as corresponding under the given linear mapping
provide two equations for $\alpha$ and $\beta$ as follows: $z = 1 + j$ mapping to $w = j$
implies
$$j = \alpha(1 + j) + \beta$$
while $z = 1 - j$ mapping to $w = -1$ implies
$$-1 = \alpha(1 - j) + \beta$$
Subtracting these two equations in $\alpha$ and $\beta$ gives
$$j + 1 = \alpha(1 + j) - \alpha(1 - j)$$
so that
$$\alpha = \frac{1 + j}{j2} = \tfrac{1}{2}(1 - j)$$
Substituting back for $\beta$ then gives
$$\beta = j - (1 + j)\alpha = j - \tfrac{1}{2}(1 - j^2) = j - 1$$
so that
$$w = \tfrac{1}{2}(1 - j)z + j - 1 = (1 - j)(\tfrac{1}{2}z - 1)$$

(b) The best way to find specific image curves in the w plane is first to express
z (= x + jy) in terms of w (= u + jv) and then, by equating real and imaginary parts,
to express x and y in terms of u and v. We have
$$w = (1 - j)(\tfrac{1}{2}z - 1)$$
which, on dividing by 1 − j, gives
$$\frac{w}{1 - j} = \tfrac{1}{2}z - 1$$
Taking $w = u + jv$ and $z = x + jy$ and then rationalizing the left-hand side, we have
$$\tfrac{1}{2}(u + jv)(1 + j) = \tfrac{1}{2}(x + jy) - 1$$
Equating real and imaginary parts then gives
$$u - v = x - 2, \quad u + v = y \tag{4.6}$$
The first of these can be used to find the image of x > 0. It is $u - v > -2$, which
is also a region bordered by the straight line $u - v = -2$ and shown in Figure 4.8.
Pick one point in the right half of the z plane, say z = 2, and the mapping gives
w = 0 as the image of this point. This allays any doubts about which side of
$u - v = -2$ corresponds to the right half of the z plane, x > 0. The two corresponding
regions are shown ‘hatched’ in Figure 4.8.

Note that the following is always true, although we shall not prove it here. If a
curve cuts the z plane in two then the corresponding curve in the w plane also cuts
the w plane in two, and, further, points in one of the two distinct sets of the z plane
partitioned by the curve correspond to points in just one of the similarly partitioned
sets in the w plane.

Figure 4.8 The mappings of Example 4.3.

(c) In cartesian form, with $z = x + jy$, the equation of the unit circle |z| = 1 is
$$x^2 + y^2 = 1$$
Substituting for x and y from the mapping relationships (4.6) gives the image of
this circle as
$$(u - v + 2)^2 + (u + v)^2 = 1$$
or
$$u^2 + v^2 + 2u - 2v + \tfrac{3}{2} = 0$$
which, on completing the squares, leads to
$$(u + 1)^2 + (v - 1)^2 = \tfrac{1}{2}$$
As expected, this is a circle, having in this particular case centre (−1, 1) and
radius $\sqrt{\tfrac{1}{2}}$. If $x^2 + y^2 < 1$ then $(u + 1)^2 + (v - 1)^2 < \tfrac{1}{2}$, so the region inside the
circle |z| = 1 in the z plane corresponds to the region inside its image circle in
the w plane. Corresponding regions are shown shaded in Figure 4.8.

(d) The fixed point(s) of the mapping are obtained by putting w = z in $w = \alpha z + \beta$,
leading to
$$z = (\tfrac{1}{2}z - 1)(1 - j)$$
that is,
$$z = \tfrac{1}{2}z - \tfrac{1}{2}jz - 1 + j$$
so that
$$z = \frac{-1 + j}{\tfrac{1}{2} + \tfrac{1}{2}j} = j2$$
is the only fixed point.

One final point is in order before we leave this example. In Figure 4.8 the images of
x = 0 and $x^2 + y^2 = 1$ can also be seen in the context of translation, rotation (the line in
Figure 4.8 rotates about z = 2j) and magnification (in fact, shrinkage, as can be seen by
the smaller diameter of the image circle in the w plane compared with the circle in the z plane).
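The results of Example 4.3 can be reproduced numerically. The following Python sketch is an illustrative check rather than part of the original solution; the variable names are assumptions. It solves for α and β from the two given point pairs, then confirms the fixed point, the image of the unit circle and the side of the line u − v = −2 onto which a point with x > 0 is mapped.

```python
import cmath
import math

# The two corresponding point pairs given in the example.
z1, w1 = 1 + 1j, 1j
z2, w2 = 1 - 1j, -1 + 0j

# Subtracting the two equations w = alpha*z + beta eliminates beta.
alpha = (w1 - w2) / (z1 - z2)
beta = w1 - alpha * z1


def f(z: complex) -> complex:
    """The linear mapping w = alpha*z + beta determined above."""
    return alpha * z + beta


# Fixed point: z = alpha*z + beta  =>  z = beta / (1 - alpha).
fixed_point = beta / (1 - alpha)

# Image of the unit circle: centre f(0), radius |alpha| * 1.
centre_image = f(0j)
radius_image = abs(alpha)
samples = [f(cmath.exp(2j * math.pi * k / 12)) for k in range(12)]
max_radius_error = max(abs(abs(w - centre_image) - radius_image) for w in samples)

# The test point z = 2 (Re(z) > 0) and the quantity u - v of its image.
w_test = f(2 + 0j)
u_minus_v = w_test.real - w_test.imag

print(alpha, beta, fixed_point, centre_image, radius_image)
```

The radius printed should agree with √(1/2), and `u_minus_v` should exceed −2, matching the region found in part (b).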
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4.2.2 Exercises 


1  Find in the cartesian form y = mx + c (m and c real
   constants) the equations of the following straight
   lines in the z plane, z = x + jy:

   (a) $|z - 2 + j| = |z - j + 3|$
   (b) $z + z^* + 4j(z - z^*) = 6$

   where * denotes the complex conjugate.

2  Find the point of intersection and the angle of
   intersection of the straight lines

   $|z - 1 - j| = |z - 3 + j|$
   $|z - 1 + j| = |z - 3 - j|$

3  The function $w = jz + 4 - 3j$ is a combination of
   translation and rotation. Show this diagrammatically,
   following the procedure used in Example 4.2. Find
   the image of the line 6x + y = 22 (z = x + jy) in the
   w plane under this mapping.

4  Show that the mapping $w = (1 - j)z$, where
   w = u + jv and z = x + jy, maps the region y > 1
   in the z plane onto the region u + v > 2 in the
   w plane. Illustrate the regions in a diagram.

5  Under the mapping $w = jz + j$, where w = u + jv
   and z = x + jy, show that the half-plane x > 0
   in the z plane maps onto the half-plane v > 1 in the
   w plane.

6  For z = x + jy find the image region in the w plane
   corresponding to the semi-infinite strip x > 0,
   0 < y < 2 in the z plane under the mapping
   $w = jz + 1$. Illustrate the regions in both planes.

7  Find the images of the following curves under
   the mapping

   $w = (j + \sqrt{3})z + j\sqrt{3} - 1$

   (a) y = 0    (b) x = 0
   (c) $x^2 + y^2 = 1$    (d) $x^2 + y^2 + 2y = 1$

   where z = x + jy.

8  The mapping $w = \alpha z + \beta$ ($\alpha$, $\beta$ both constant
   complex numbers) maps the point z = 1 + j to
   the point w = j and the point z = −1 to the point
   w = 1 + j.

   (a) Determine $\alpha$ and $\beta$.

   (b) Find the region in the w plane
   corresponding to the upper half-plane
   Im(z) > 0 and illustrate diagrammatically.

   (c) Find the region in the w plane corresponding to
   the disc |z| < 2 and illustrate diagrammatically.

   (d) Find the fixed point(s) of the mapping.

   In (b)-(d) use the values of $\alpha$ and $\beta$ determined
   in (a).


4.2.3 Inversion

The inversion mapping is of the form
$$w = \frac{1}{z} \tag{4.7}$$
and in this subsection we shall consider the image of circles and straight lines in the
z plane under such a mapping. Clearly, under this mapping the image in the w plane of
the general circle
$$|z - z_0| = r$$
in the z plane, with centre at $z_0$ and radius r, is given by
$$\left|\frac{1}{w} - z_0\right| = r \tag{4.8}$$


but it is not immediately obvious what shaped curve this represents in the w plane. To
investigate, we take $w = u + jv$ and $z_0 = x_0 + jy_0$ in (4.8), giving
$$\left|\frac{u - jv}{u^2 + v^2} - x_0 - jy_0\right| = r$$
Squaring we have
$$\left(\frac{u}{u^2 + v^2} - x_0\right)^2 + \left(\frac{v}{u^2 + v^2} + y_0\right)^2 = r^2$$
which on expanding leads to
$$\frac{u^2}{(u^2 + v^2)^2} - \frac{2ux_0}{u^2 + v^2} + x_0^2 + \frac{v^2}{(u^2 + v^2)^2} + \frac{2vy_0}{u^2 + v^2} + y_0^2 = r^2$$
or
$$\frac{u^2 + v^2}{(u^2 + v^2)^2} + \frac{2vy_0 - 2ux_0}{u^2 + v^2} = r^2 - x_0^2 - y_0^2$$
so that
$$(u^2 + v^2)(r^2 - x_0^2 - y_0^2) + 2ux_0 - 2vy_0 = 1 \tag{4.9}$$
The expression is a quadratic in u and v, with the coefficients of $u^2$ and $v^2$ equal and no
term in uv. It therefore represents a circle, unless the coefficient of $u^2 + v^2$ is itself zero,
which occurs when
$$x_0^2 + y_0^2 = r^2, \quad \text{or} \quad |z_0| = r$$
and we have
$$2ux_0 - 2vy_0 = 1$$
which represents a straight line in the w plane.

Summarizing, the inversion mapping $w = 1/z$ maps the circle $|z - z_0| = r$ in the z
plane onto another circle in the w plane unless $|z_0| = r$, in which case the circle is
mapped onto a straight line in the w plane that does not pass through the origin.

When $|z_0| \neq r$, we can divide the equation of the circle (4.9) in the w plane by the
factor $r^2 - x_0^2 - y_0^2$ to give
$$u^2 + v^2 + \frac{2x_0 u}{r^2 - x_0^2 - y_0^2} - \frac{2y_0 v}{r^2 - x_0^2 - y_0^2} = \frac{1}{r^2 - x_0^2 - y_0^2}$$
which can be written in the form
$$(u - u_0)^2 + (v - v_0)^2 = R^2$$
where $(u_0, v_0)$ are the coordinates of the centre and R the radius of the w plane circle. It
is left as an exercise for the reader to show that
$$(u_0, v_0) = \left(-\frac{x_0}{r^2 - |z_0|^2},\ \frac{y_0}{r^2 - |z_0|^2}\right), \quad R = \frac{r}{\left|r^2 - |z_0|^2\right|}$$
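The centre and radius formulas, and the straight-line case, can be tested numerically before attempting the algebra. The sketch below is an illustrative check under stated assumptions: the function name `inversion_image_of_circle` and the sample circles are hypothetical choices, and the radius is taken as the positive value r/|r² − |z₀|²|.

```python
import cmath
import math


def inversion_image_of_circle(z0: complex, r: float):
    """Centre and radius of the image of |z - z0| = r under w = 1/z.

    Assumes |z0| != r, so that the image is a circle and not a straight line.
    """
    d = r * r - abs(z0) ** 2
    if d == 0:
        raise ValueError("circle passes through the origin; image is a straight line")
    centre = complex(-z0.real / d, z0.imag / d)
    return centre, r / abs(d)


# Illustrative circle that does not pass through the origin.
z0, r = 1 + 2j, 1.0
centre, radius = inversion_image_of_circle(z0, r)
images = [1 / (z0 + r * cmath.exp(2j * math.pi * k / 16)) for k in range(16)]
max_deviation = max(abs(abs(w - centre) - radius) for w in images)

# Illustrative circle through the origin (z0 = 1, r = 1): images should satisfy
# 2*u*x0 - 2*v*y0 = 1, which here reduces to u = 1/2.
x0, y0 = 1.0, 0.0
line_points = [1 / (complex(x0, y0) + cmath.exp(1j * t)) for t in (0.5, 1.0, 2.0, -1.5)]
line_residual = max(abs(2 * w.real * x0 - 2 * w.imag * y0 - 1) for w in line_points)

print(centre, radius, max_deviation, line_residual)
```

For the first circle the expected centre is (1/4, −1/2) with radius 1/4, since r² − |z₀|² = −4.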
Next we consider the general straight line
$$|z - a_1| = |z - a_2|$$
in the z plane, where $a_1$ and $a_2$ are constant complex numbers with $a_1 \neq a_2$. Under the
mapping (4.7), this becomes the curve in the w plane represented by the equation
$$\left|\frac{1}{w} - a_1\right| = \left|\frac{1}{w} - a_2\right| \tag{4.10}$$
Again, it is not easy to identify this curve, so we proceed as before and take
$$w = u + jv, \quad a_1 = p + jq, \quad a_2 = r + js$$
where p, q, r and s are real constants. Substituting in (4.10) and squaring both sides, we
have
$$\left(\frac{u}{u^2 + v^2} - p\right)^2 + \left(\frac{v}{u^2 + v^2} + q\right)^2 = \left(\frac{u}{u^2 + v^2} - r\right)^2 + \left(\frac{v}{u^2 + v^2} + s\right)^2$$
Expanding out each term, the squares of $u/(u^2 + v^2)$ and $v/(u^2 + v^2)$ cancel, giving
$$-\frac{2up}{u^2 + v^2} + p^2 + \frac{2vq}{u^2 + v^2} + q^2 = -\frac{2ur}{u^2 + v^2} + r^2 + \frac{2vs}{u^2 + v^2} + s^2$$
which on rearrangement becomes
$$(u^2 + v^2)(p^2 + q^2 - r^2 - s^2) + 2u(r - p) + 2v(q - s) = 0 \tag{4.11}$$
Again this represents a circle through the origin in the w plane, unless
$$p^2 + q^2 = r^2 + s^2$$
which implies $|a_1| = |a_2|$, when it represents a straight line, also through the origin, in
the w plane. The algebraic form of the coordinates of the centre of the circle and its
radius can be deduced from (4.11).

We can therefore make the important conclusion that the inversion mapping
$w = 1/z$ takes circles or straight lines in the z plane onto circles or straight lines in
the w plane. Further, since we have carried out the algebra, we can be more
specific. If the circle in the z plane passes through the origin (that is, $|z_0| = r$ in (4.9))
then it is mapped onto a straight line that does not pass through the origin in the w
plane. If the straight line in the z plane passes through the origin ($|a_1| = |a_2|$ in
(4.11)) then it is mapped onto a straight line through the origin in the w plane.
Figure 4.9 summarizes these conclusions.


To see why this is the case, we first note that the fixed points of the mapping,
determined by putting w = z, are
$$z = \frac{1}{z}, \quad \text{or} \quad z^2 = 1$$
so that $z = \pm 1$.
We also note that z = 0 is mapped to infinity in the w plane and w = 0 is mapped to
infinity in the z plane and vice versa in both cases. Further, if we apply the mapping a
second time, we get the identity mapping. That is, if
$$w = \frac{1}{z}, \quad \text{and} \quad \zeta = \frac{1}{w}$$
then
$$\zeta = \frac{1}{1/z} = z$$
which is the identity mapping.

Figure 4.9 The inversion mapping $w = 1/z$.

The inside of the unit circle in the z plane, |z| < 1, is mapped onto |1/w| < 1 or
|w| > 1, the outside of the unit circle in the w plane. By the same token, therefore,
the outside of the unit circle in the z plane |z| > 1 is mapped onto |1/w| > 1 or
|w| < 1, the inside of the unit circle in the w plane. Points actually on |z| = 1 in the
z plane are mapped to points on |w| = 1 in the w plane, with ±1 staying fixed, as
already shown. Figure 4.10 summarizes this property.

It is left as an exercise for the reader to show that the top half-boundary of |z| = 1 is 
mapped onto the bottom half-boundary of | w| = 1. 
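These properties of the inversion mapping are easy to confirm on a few sample points. The sketch below is illustrative only; the function name `invert` and the chosen sample points are assumptions made for the check.

```python
def invert(z: complex) -> complex:
    """The inversion mapping w = 1/z (undefined at z = 0)."""
    return 1 / z


# Sample points chosen for illustration: some inside and some outside |z| = 1.
inside = [0.5 + 0.2j, -0.3j, 0.9 + 0j]
outside = [2 + 1j, -3 + 0j, 1.5j]

inside_maps_outside = all(abs(invert(z)) > 1 for z in inside)
outside_maps_inside = all(abs(invert(z)) < 1 for z in outside)

# Applying the mapping twice should return the starting point.
round_trip_error = max(abs(invert(invert(z)) - z) for z in inside + outside)

# The points +1 and -1 should be left unchanged.
fixed = [z for z in (1 + 0j, -1 + 0j) if invert(z) == z]

# A point on the top half of |z| = 1 should land on the bottom half of |w| = 1.
top_point = complex(0.6, 0.8)
top_image = invert(top_point)

print(inside_maps_outside, outside_maps_inside, round_trip_error, fixed, top_image)
```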

For any point $z_0$ in the z plane the point $1/z_0$ is called the inverse of $z_0$ with respect
to the circle |z| = 1; this is the reason for the name of the mapping. (Note the double
meaning of inverse; here it means the reciprocal function and not the ‘reverse’
mapping.) The more general definition of inverse is that for any point $z_0$ in the z plane
the point $r^2/z_0$ is the inverse of $z_0$ with respect to the circle |z| = r, where r is a real
constant.

Figure 4.10 Mapping of the unit circle under $w = 1/z$.

Example 4.4

Determine the image path in the w plane corresponding to the circle |z − 3| = 2 in the
z plane under the mapping $w = 1/z$. Sketch the paths in both the z and w planes and
shade the region in the w plane corresponding to the region inside the circle in the
z plane.

Solution

The image in the w plane of the circle |z − 3| = 2 in the z plane under the mapping
$w = 1/z$ is given by
$$\left|\frac{1}{w} - 3\right| = 2$$
which, on taking $w = u + jv$, gives
$$\left|\frac{u - jv}{u^2 + v^2} - 3\right| = 2$$
Squaring both sides, we then have
$$\left(\frac{u}{u^2 + v^2} - 3\right)^2 + \left(\frac{-v}{u^2 + v^2}\right)^2 = 4$$
or
$$\frac{u^2 + v^2}{(u^2 + v^2)^2} - \frac{6u}{u^2 + v^2} + 5 = 0$$
which reduces to
$$1 - 6u + 5(u^2 + v^2) = 0$$
or
$$(u - \tfrac{3}{5})^2 + v^2 = \tfrac{4}{25}$$
Thus the image in the w plane is a circle with centre $(\tfrac{3}{5}, 0)$ and radius $\tfrac{2}{5}$. The
corresponding circles in the z and w planes are shown in Figure 4.11.

Figure 4.11 The mapping of Example 4.4.

Taking $z = x + jy$, the mapping $w = 1/z$ becomes
$$u + jv = \frac{1}{x + jy} = \frac{x - jy}{x^2 + y^2}$$
which, on equating real and imaginary parts, gives
$$u = \frac{x}{x^2 + y^2}, \quad v = \frac{-y}{x^2 + y^2}$$
We can now use these two relationships to determine the images of particular points
under the mapping. In particular, the centre (3, 0) of the circle in the z plane is mapped
onto the point $u = \tfrac{1}{3}$, $v = 0$ in the w plane, which is inside the mapped circle. Thus, under
the mapping, the region inside the circle in the z plane is mapped onto the region inside
the circle in the w plane.

Further, considering three sample points A(1 + j0), B(3 − j2) and C(5 + j0) on the
circle in the z plane, we find that the corresponding image points on the circle in the w
plane are $A'(1, 0)$, $B'(\tfrac{3}{13}, \tfrac{2}{13})$ and $C'(\tfrac{1}{5}, 0)$. Thus, as the point z traverses the circle in the
z plane in an anticlockwise direction, the corresponding point w in the w plane will also
traverse the mapped circle in an anticlockwise direction as indicated in Figure 4.11.
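The conclusions of Example 4.4 can be checked numerically. The following sketch is illustrative and not part of the original solution; the helper `signed_area` is a hypothetical name, used here to test the direction of traversal (a positive signed area indicates anticlockwise ordering).

```python
import cmath
import math

# Image circle stated in the example: centre (3/5, 0), radius 2/5.
centre, radius = 0.6 + 0j, 0.4

# Points on |z - 3| = 2, listed anticlockwise, and their images under w = 1/z.
circle_points = [3 + 2 * cmath.exp(2j * math.pi * k / 24) for k in range(24)]
image_points = [1 / z for z in circle_points]
max_deviation = max(abs(abs(w - centre) - radius) for w in image_points)

# Images of the sample points A, B, C and of the centre of the z plane circle.
A_image, B_image, C_image = 1 / (1 + 0j), 1 / (3 - 2j), 1 / (5 + 0j)
centre_image = 1 / (3 + 0j)


def signed_area(points):
    """Shoelace formula; positive when the polygon is traversed anticlockwise."""
    shifted = points[1:] + points[:1]
    return 0.5 * sum(p.real * q.imag - q.real * p.imag for p, q in zip(points, shifted))


image_orientation = signed_area(image_points)
print(max_deviation, A_image, B_image, C_image, centre_image, image_orientation)
```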


4.2.4 Bilinear mappings 
A bilinear mapping is a mapping of the form
$$w = \frac{az + b}{cz + d} \tag{4.12}$$
where a, b, c and d are prescribed complex constants. It is called the bilinear mapping
in z and w since it can be written in the form $Awz + Bw + Cz + D = 0$, which is linear
in both z and w.

Clearly the bilinear mapping (4.12) is more complicated than the linear mapping
given by (4.2). In fact, the general linear mapping is a special case of the bilinear
mapping, since setting c = 0 and d = 1 in (4.12) gives (4.2). In order to investigate the
bilinear mapping, we rewrite the right-hand side of (4.12) as follows:
$$w = \frac{az + b}{cz + d} = \frac{\frac{a}{c}(cz + d) - \frac{ad}{c} + b}{cz + d}$$
so that
$$w = \frac{a}{c} + \frac{bc - ad}{c(cz + d)} \tag{4.13}$$
This mapping clearly degenerates to $w = a/c$ unless we demand that $bc - ad \neq 0$. We
therefore say that (4.12) represents a bilinear mapping provided the determinant
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$$
is non-zero. This is sometimes referred to as the determinant of the mapping. When
the condition holds, the inverse mapping
$$z = \frac{-dw + b}{cw - a}$$
obtained by rearranging (4.12), is also bilinear, since
$$\begin{vmatrix} -d & b \\ c & -a \end{vmatrix} = da - cb \neq 0$$
Renaming the constants so that $\lambda = a/c$, $\mu = bc - ad$, $\alpha = c^2$ and $\beta = cd$, (4.13)
becomes
$$w = \lambda + \frac{\mu}{\alpha z + \beta}$$
and we can break the mapping down into three steps as follows:
$$z_1 = \alpha z + \beta$$
$$z_2 = \frac{1}{z_1}$$
$$w = \lambda + \mu z_2$$
The first and third of these steps are linear mappings as considered in Section 4.2.1,
while the second is the inversion mapping considered in Section 4.2.3. The bilinear
mapping (4.12) can thus be generated from the following elementary mappings:
$$z \xrightarrow[\text{and magnification}]{\text{rotation}} \alpha z \xrightarrow{\text{translation}} \alpha z + \beta \xrightarrow{\text{inversion}} \frac{1}{\alpha z + \beta} \xrightarrow[\text{and rotation}]{\text{magnification}} \frac{\mu}{\alpha z + \beta} \xrightarrow{\text{translation}} \lambda + \frac{\mu}{\alpha z + \beta} = w$$
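The three-step decomposition can be compared with the direct formula (4.12) numerically. The sketch below is illustrative: the function names and the sample coefficients are assumptions, and it presumes c ≠ 0 and ad − bc ≠ 0 so that λ = a/c is defined and the mapping is not degenerate.

```python
def bilinear(z: complex, a: complex, b: complex, c: complex, d: complex) -> complex:
    """Direct evaluation of w = (a z + b) / (c z + d)."""
    return (a * z + b) / (c * z + d)


def bilinear_in_steps(z: complex, a: complex, b: complex, c: complex, d: complex) -> complex:
    """Evaluate the same mapping as linear map, inversion, linear map.

    Assumes c != 0 and a*d - b*c != 0.
    """
    lam, mu = a / c, b * c - a * d   # lambda and mu of the renamed form
    alpha, beta = c * c, c * d       # alpha and beta of the renamed form
    z1 = alpha * z + beta            # step 1: linear mapping
    z2 = 1 / z1                      # step 2: inversion
    return lam + mu * z2             # step 3: linear mapping


# Hypothetical coefficients (determinant 6 + 2j, pole at z = 3j) and sample points.
a, b, c, d = 2 + 1j, 1 + 0j, 1j, 3 + 0j
points = [0j, 1 + 0j, 2 - 1j, -3 + 0.5j]

max_error = max(
    abs(bilinear(z, a, b, c, d) - bilinear_in_steps(z, a, b, c, d)) for z in points
)
print(max_error)
```

The sample points deliberately avoid the pole z = −d/c, where both forms are undefined.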
We saw in Section 4.2.1 that the general linear transformation $w = \alpha z + \beta$ does not
change the shape of the curve being mapped from the z plane onto the w plane. Also,
in Section 4.2.3 we saw that the inversion mapping $w = 1/z$ maps circles or straight lines
in the z plane onto circles or straight lines in the w plane. It follows that the bilinear
mapping also exhibits this important property, in that it also will map circles or straight
lines in the z plane onto circles or straight lines in the w plane.

Example 4.5
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Investigate the mapping 


$$w = \frac{z - 1}{z + 1}$$
by finding the images in the w plane of the lines Re(z) = constant and Im(z) = constant.
Find the fixed points of the mapping.

Solution

Since we are seeking specific image curves in the w plane, we first express z in terms
of w and then express x and y in terms of u and v, where $z = x + jy$ and $w = u + jv$.
Rearranging
$$w = \frac{z - 1}{z + 1}$$
gives
$$z = \frac{1 + w}{1 - w}$$
Taking $z = x + jy$ and $w = u + jv$, we have
$$x + jy = \frac{1 + u + jv}{1 - u - jv} = \frac{1 + u + jv}{1 - u - jv} \cdot \frac{1 - u + jv}{1 - u + jv}$$
which reduces to
$$x + jy = \frac{1 - u^2 - v^2}{(1 - u)^2 + v^2} + j\frac{2v}{(1 - u)^2 + v^2}$$
Equating real and imaginary parts then gives
$$x = \frac{1 - u^2 - v^2}{(1 - u)^2 + v^2} \tag{4.14a}$$
$$y = \frac{2v}{(1 - u)^2 + v^2} \tag{4.14b}$$
It follows from (4.14a) that the lines Re(z) = x = $c_1$, which are parallel to the imaginary
axis in the z plane, correspond to the curves
$$c_1 = \frac{1 - u^2 - v^2}{(1 - u)^2 + v^2}$$
where $c_1$ is a constant, in the w plane. Rearranging this leads to
$$c_1(1 - 2u + u^2 + v^2) = 1 - u^2 - v^2$$
or, assuming that $1 + c_1 \neq 0$,
$$u^2 + v^2 - \frac{2c_1 u}{1 + c_1} + \frac{c_1 - 1}{c_1 + 1} = 0$$
which, on completing squares, gives
$$\left(u - \frac{c_1}{1 + c_1}\right)^2 + v^2 = \left(\frac{1}{1 + c_1}\right)^2$$
It is now clear that the corresponding curve in the w plane is a circle, centre
($u = c_1/(1 + c_1)$, $v = 0$) and radius $1/|1 + c_1|$.

In the algebraic manipulation we assumed that $c_1 \neq -1$, in order to divide by $1 + c_1$.
In the exceptional case $c_1 = -1$, we have u = 1, and the mapped curve is a straight line
in the w plane parallel to the imaginary axis.

Similarly, it follows from (4.14b) that the lines Im(z) = y = $c_2$, which are parallel to
the real axis in the z plane, correspond to the curves
$$c_2 = \frac{2v}{(1 - u)^2 + v^2}$$
where $c_2$ is a constant, in the w plane. Again, this usually represents a circle in the w
plane, but exceptionally will represent a straight line. Rearranging the equation we have
$$(1 - u)^2 + v^2 = \frac{2v}{c_2}$$
provided that $c_2 \neq 0$. Completing the square then leads to
$$(u - 1)^2 + \left(v - \frac{1}{c_2}\right)^2 = \frac{1}{c_2^2}$$
which represents a circle in the w plane, centre ($u = 1$, $v = 1/c_2$) and radius $1/|c_2|$.

In the exceptional case $c_2 = 0$, v = 0 and we see that the real axis y = 0 in the z plane
maps onto the real axis v = 0 in the w plane.

Putting a sequence of values to $c_1$ and then to $c_2$, say −10 to +10 in steps of +1,
enables us to sketch the mappings shown in Figure 4.12. The fixed points of the mapping
are given by
$$z = \frac{z - 1}{z + 1}$$
that is,
$$z^2 = -1, \quad \text{or} \quad z = \pm j$$

Figure 4.12 The mapping $w = (z - 1)/(z + 1)$.

In general, all bilinear mappings will have two fixed points. However, although there
are mathematically interesting properties associated with particular mappings having
coincident fixed points, they do not impinge on engineering applications, so they only
deserve passing reference here.
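The image circles found in this example can be checked by mapping sample points from a vertical and a horizontal line. The sketch below is an illustrative check; the function names and the particular lines x = 2 and y = −3 are assumptions chosen for the test.

```python
import cmath


def f(z: complex) -> complex:
    """The bilinear mapping w = (z - 1)/(z + 1)."""
    return (z - 1) / (z + 1)


def image_circle_of_vertical_line(c1: float):
    """Centre and radius of the image of x = c1 (requires c1 != -1)."""
    return complex(c1 / (1 + c1), 0.0), 1 / abs(1 + c1)


def image_circle_of_horizontal_line(c2: float):
    """Centre and radius of the image of y = c2 (requires c2 != 0)."""
    return complex(1.0, 1 / c2), 1 / abs(c2)


def max_deviation(points, centre, radius):
    """Largest distance of the mapped points from the stated circle."""
    return max(abs(abs(f(z) - centre) - radius) for z in points)


# Sample points on the illustrative lines x = 2 and y = -3.
dev_vertical = max_deviation(
    [complex(2, y) for y in (-5, -1, 0, 0.5, 3)], *image_circle_of_vertical_line(2)
)
dev_horizontal = max_deviation(
    [complex(x, -3) for x in (-4, -0.5, 0, 2, 7)], *image_circle_of_horizontal_line(-3)
)

# Candidate fixed points z = +j and z = -j.
fixed_points = [z for z in (1j, -1j) if cmath.isclose(f(z), z, abs_tol=1e-12)]

print(dev_vertical, dev_horizontal, fixed_points)
```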


Example 4.6

Find the image in the w plane of the circle |z| = 2 in the z plane under the bilinear
mapping
$$w = \frac{z - j}{z + j}$$
Sketch the curves in both the z and w planes and shade the region in the w plane
corresponding to the region inside the circle in the z plane.

Solution

Rearranging the transformation, we have
$$z = \frac{jw + j}{1 - w}$$
so that the image in the w plane of the circle |z| = 2 in the z plane is determined by
$$\left|\frac{jw + j}{1 - w}\right| = 2 \tag{4.15}$$
One possible way of proceeding now is to put $w = u + jv$ and proceed as in Example 4.4,
but the algebra becomes a little messy. An alternative approach is to use the property
of complex numbers that $|z_1/z_2| = |z_1|/|z_2|$, so that (4.15) becomes
$$|jw + j| = 2|1 - w|$$
Taking $w = u + jv$ then gives
$$|-v + j(u + 1)| = 2|(1 - u) - jv|$$
which on squaring both sides leads to
$$v^2 + (1 + u)^2 = 4[(1 - u)^2 + v^2]$$
or
$$u^2 + v^2 - \tfrac{10}{3}u + 1 = 0$$
Completing the square of the u term then gives
$$(u - \tfrac{5}{3})^2 + v^2 = \tfrac{16}{9}$$
indicating that the image curve in the w plane is a circle centre ($u = \tfrac{5}{3}$, $v = 0$) and radius
$\tfrac{4}{3}$. The corresponding circles in the z and w planes are illustrated in Figure 4.13. To
identify corresponding regions, we consider the mapping of the point z = 0 + j0
inside the circle in the z plane. Under the given mapping, this maps to the point
$$w = \frac{0 - j}{0 + j} = -1 = -1 + j0$$
in the w plane. It then follows that the region inside the circle |z| = 2 in the z plane maps
onto the region outside the mapped circle in the w plane.

Figure 4.13 The mapping $w = (z - j)/(z + j)$.
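A quick numerical check of this result is sketched below. It is illustrative only and the variable names are assumptions: it maps points of the circle |z| = 2, measures how far their images lie from the circle with centre (5/3, 0) and radius 4/3, and tests on which side of that circle the image of the origin falls.

```python
import cmath
import math


def f(z: complex) -> complex:
    """The bilinear mapping w = (z - j)/(z + j)."""
    return (z - 1j) / (z + 1j)


# Image circle obtained in the worked solution.
centre, radius = complex(5 / 3, 0), 4 / 3

# Points on |z| = 2 and the deviation of their images from the stated circle.
circle = [2 * cmath.exp(2j * math.pi * k / 36) for k in range(36)]
max_deviation = max(abs(abs(f(z) - centre) - radius) for z in circle)

# The origin lies inside |z| = 2; its image decides which region corresponds.
image_of_origin = f(0j)
origin_maps_outside = abs(image_of_origin - centre) > radius

print(max_deviation, image_of_origin, origin_maps_outside)
```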







An interesting property of (4.12) is that there is just one bilinear transformation that
maps three given distinct points $z_1$, $z_2$ and $z_3$ in the z plane onto three specified distinct
points $w_1$, $w_2$ and $w_3$ respectively in the w plane. It is left as an exercise for the reader
to show that the bilinear transformation is given by
$$\frac{(w - w_1)(w_2 - w_3)}{(w - w_3)(w_2 - w_1)} = \frac{(z - z_1)(z_2 - z_3)}{(z - z_3)(z_2 - z_1)} \tag{4.16}$$
The right-hand side of (4.16) is called the cross-ratio of $z_1$, $z_2$, $z_3$ and z. We shall
illustrate with an example.

Example 4.7

Find the bilinear transformation that maps the three points z = 0, −j and −1 onto the
three points w = j, 1, 0 respectively in the w plane.

Solution

Taking the transformation to be
$$w = \frac{az + b}{cz + d}$$
on using the given information on the three pairs of corresponding points we have
$$j = \frac{a(0) + b}{c(0) + d} = \frac{b}{d} \tag{4.17a}$$
$$1 = \frac{a(-j) + b}{c(-j) + d} \tag{4.17b}$$
$$0 = \frac{a(-1) + b}{c(-1) + d} \tag{4.17c}$$
From (4.17c) a = b; then from (4.17a)
$$d = \frac{b}{j} = -jb = -ja$$
and from (4.17b) c = ja. Thus
$$w = \frac{az + a}{jaz - ja} = \frac{1}{j}\,\frac{z + 1}{z - 1} = -j\,\frac{z + 1}{z - 1}$$
Alternatively, using (4.16) we can obtain
$$\frac{(w - j)(1 - 0)}{(w - 0)(1 - j)} = \frac{(z - 0)(-j + 1)}{(z + 1)(-j - 0)}$$
or
$$w = -j\,\frac{z + 1}{z - 1}$$
as before.
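The cross-ratio relation (4.16) can be turned into a small routine that returns coefficients a, b, c, d for the mapping through three given point pairs. The sketch below is one possible way of doing this and is not taken from the text: the function names are hypothetical, each side of (4.16) is treated as a bilinear map with a 2 × 2 coefficient array, and the w-side is inverted using its adjugate.

```python
def cross_ratio_matrix(p1: complex, p2: complex, p3: complex):
    """Coefficients (A, B, C, D) of t = (A z + B)/(C z + D), where
    t = (z - p1)(p2 - p3) / ((z - p3)(p2 - p1))."""
    return (p2 - p3, -p1 * (p2 - p3), p2 - p1, -p3 * (p2 - p1))


def bilinear_through(zs, ws):
    """Coefficients (a, b, c, d) of the bilinear map sending zs[i] to ws[i]."""
    A, B, C, D = cross_ratio_matrix(*zs)   # z-side of (4.16)
    E, F, G, H = cross_ratio_matrix(*ws)   # w-side of (4.16)
    # w = S^{-1}(T(z)); the inverse of S is its adjugate [[H, -F], [-G, E]] up to scale.
    return (H * A - F * C, H * B - F * D, -G * A + E * C, -G * B + E * D)


def apply_bilinear(coeffs, z: complex) -> complex:
    """Evaluate w = (a z + b)/(c z + d) for coeffs = (a, b, c, d)."""
    a, b, c, d = coeffs
    return (a * z + b) / (c * z + d)


# The three point pairs of Example 4.7.
zs = (0j, -1j, -1 + 0j)
ws = (1j, 1 + 0j, 0j)
coeffs = bilinear_through(zs, ws)

# Compare with the closed form w = -j (z + 1)/(z - 1) at an extra test point.
z_test = 2 + 0j
closed_form = -1j * (z_test + 1) / (z_test - 1)
print(coeffs, apply_bilinear(coeffs, z_test), closed_form)
```

The coefficients are determined only up to a common non-zero factor, so the routine may return a scalar multiple of the values found by hand.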


4.2.5 Exercises

9  Show that if z = x + jy, the image of the half-plane
   y > c (c constant) under the mapping $w = 1/z$ is the
   interior of a circle, provided that c > 0. What is
   the image when c = 0 and when c < 0? Illustrate
   with sketches in the w plane.

10 Determine the image in the w plane of the circle

   $\left|z + \tfrac{3}{4} + j\right| = \tfrac{7}{4}$

   under the inversion mapping $w = 1/z$.

11 Show that the mapping $w = 1/z$ maps the circle
   $|z - a| = a$, with a being a positive real constant,
   onto a straight line in the w plane. Sketch the
   corresponding curves in the z and w planes,
   indicating the region onto which the interior
   of the circle in the z plane is mapped.

12 Find a bilinear mapping that maps z = 0 to w = j,
   z = −j to w = 1 and z = −1 to w = 0. Hence sketch
   the mapping by finding the images in the w plane
   of the lines Re(z) = constant and Im(z) = constant in
   the z plane. Verify that $z = \tfrac{1}{2}(j - 1)(-1 \pm \sqrt{3})$ are
   fixed points of the mapping.

13 The two complex variables w and z are related
   through the inverse mapping

   $w = \dfrac{1 + j}{z}$

   (a) Find the images of the points z = 1, 1 − j and
   0 in the w plane.

   (b) Find the region of the w plane corresponding
   to the interior of the unit circle |z| < 1 in the
   z plane.

   (c) Find the curves in the w plane corresponding
   to the straight lines x = y and x + y = 1 in the
   z plane.

   (d) Find the fixed points of the mapping.

14 Given the complex mapping

   $w = \dfrac{z + 1}{z - 1}$

   where w = u + jv and z = x + jy, determine the
   image curve in the w plane corresponding to the
   semicircular arc $x^2 + y^2 = 1$ ($x \leq 0$) described from
   the point (0, −1) to the point (0, 1).

15 (a) Map the region in the z plane (z = x + jy) that
   lies between the lines x = y and y = 0, with x < 0,
   onto the w plane under the bilinear mapping

   $w = \dfrac{z + j}{z - 3}$

   (Hint: Consider the point $w = \tfrac{2}{3}$ to help identify
   corresponding regions.)

   (b) Show that, under the same mapping as in (a),
   the straight line 3x + y = 4 in the z plane
   corresponds to the unit circle |w| = 1 in the
   w plane and that the point w = 1 does not
   correspond to a finite value of z.

16 If $w = (z - j)/(z + j)$, find and sketch the image in
   the w plane corresponding to the circle |z| = 2 in the
   z plane.

17 Show that the bilinear mapping

   $w = e^{j\theta_0}\,\dfrac{z - z_0}{z - z_0^*}$

   where $\theta_0$ is a real constant $0 \leq \theta_0 < 2\pi$, $z_0$ a fixed
   complex number and $z_0^*$ its conjugate, maps the
   upper half of the z plane (Im(z) > 0) onto the inside
   of the unit circle in the w plane (|w| < 1). Find the
   values of $z_0$ and $\theta_0$ if w = 0 corresponds to z = j and
   w = −1 corresponds to z = ∞.

18 Show that, under the mapping

   $w = \dfrac{2jz}{z + j}$

   circular arcs or the straight line through z = 0 and
   z = j in the z plane are mapped onto circular arcs
   or the straight line through w = 0 and w = j in the
   w plane. Find the images of the regions $|z - \tfrac{1}{2}| < \tfrac{1}{2}$
   and $|z| < |z - j|$ in the w plane.

19 Find the most general bilinear mapping that maps
   the unit circle |z| = 1 in the z plane onto the unit
   circle |w| = 1 in the w plane and the point $z = z_0$ in
   the z plane to the origin w = 0 in the w plane.


4.2.6 The mapping $w = z^2$


There are a number of other mappings that are used by engineers. For example, in 
dealing with Laplace and z transforms, the subjects of Chapters 5 and 6 respectively, 
we are concerned with the polynomial mapping 


W=da)t+azt+...+4a,Z" 


where ag, a;, . . 


с РО) 
Q(z) 


. , q, are complex constants, the rational function 


where P and Q are polynomials in z, and the exponential mapping 


w=ae 


where e = 2.71828..., the base of natural logarithms. As is clear from the bilinear 
mapping in Section 4.2.4, even elementary mappings can be cumbersome to analyse. 
Fortunately, we have two factors on our side. First, very detailed tracing of specific 
curves and their images is not required, only images of points. Secondly, by using com- 
plex differentiation, the subject of Section 4.3, various facets of these more complicated 
mappings can be understood without lengthy algebra. As a prelude, in this subsection 
we analyse the mapping w = z’, which is the simplest polynomial mapping. 


Investigate the mapping у = 2? Бу plotting the images on the w plane of the lines 


x = constant and y = constant in the z plane. 


Solution 


There is some difficulty in inverting this mapping to get z as a function of w, since 


square roots lead to problems of uniqueness. However, there is no need to invert here, 
for taking w = u + jv and z = x + jy, the mapping becomes 


w-2ucjv-(x-jyy - Q8 - y) + ]2ху 


which, on taking real and imaginary parts, gives 


u=x-y 


о = 2ху 


(4.18) 


Figure 4.14
The mapping w = z².

If x = α, a real constant, then (4.18) becomes

\[ u = \alpha^2 - y^2, \qquad v = 2\alpha y \]

which, on eliminating y, gives

\[ u = \alpha^2 - \frac{v^2}{4\alpha^2} \]

or

\[ 4\alpha^2 u = 4\alpha^4 - v^2 \]

so that

\[ v^2 = 4\alpha^4 - 4\alpha^2 u = 4\alpha^2(\alpha^2 - u) \]

This represents a parabola in the w plane, and, since the right-hand side must be
positive, α² ≥ u so the ‘nose’ of the parabola is at u = α² on the positive real axis in
the w plane.

If y = β, a real constant, then (4.18) becomes

\[ u = x^2 - \beta^2, \qquad v = 2x\beta \]

which, on eliminating x, gives

\[ u = \frac{v^2}{4\beta^2} - \beta^2 \]

or

\[ 4\beta^2 u = v^2 - 4\beta^4 \]

so that

\[ v^2 = 4\beta^2 u + 4\beta^4 = 4\beta^2(u + \beta^2) \]

This is also a parabola, but pointing in the opposite direction. The right-hand side, as
before, must be positive, so that u ≥ −β² and the ‘nose’ of the parabola is at u = −β² on the
negative real axis. These curves are drawn in Figure 4.14.
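
The two parabola equations can be spot-checked numerically. The following is a minimal sketch, not part of the original solution: it maps sample points of the lines x = α and y = β under w = z² and evaluates how far the images are from the curves v² = 4α²(α² − u) and v² = 4β²(u + β²). The function names and the sample values α = 1.5 and β = 0.75 are arbitrary choices made for illustration.

```python
# Numerical check of the parabolas derived above for w = z**2.
# The helper names and sample values are illustrative, not taken from the text.

def image(z):
    """Return (u, v) for w = z**2, with w = u + jv."""
    w = z * z
    return w.real, w.imag


def residual_x_const(alpha, y):
    """v**2 - 4*alpha**2*(alpha**2 - u) at the image of z = alpha + jy."""
    u, v = image(complex(alpha, y))
    return v * v - 4 * alpha**2 * (alpha**2 - u)


def residual_y_const(beta, x):
    """v**2 - 4*beta**2*(u + beta**2) at the image of z = x + j*beta."""
    u, v = image(complex(x, beta))
    return v * v - 4 * beta**2 * (u + beta**2)


samples = [k / 4 for k in range(-12, 13)]
worst_x = max(abs(residual_x_const(1.5, y)) for y in samples)
worst_y = max(abs(residual_y_const(0.75, x)) for x in samples)
print(worst_x, worst_y)  # both should vanish up to rounding error
```

Residuals at rounding level for every sample point indicate that the images do lie on the stated parabolas.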







We shall not dwell further on the finer points of the mapping w = z². Instead, we note
that in general it is extremely difficult to plot images of curves in the z plane, even the
straight lines parallel to the axes, under polynomial mappings. We also note that we do
not often need to do so, and that we have done it only as an aid to understanding.

The exercises that follow should also help in understanding this topic. We shall then
return to examine polynomial, rational and exponential mappings in Section 4.3.4, after
introducing complex differentiation.


4.2.7 Exercises 


20 Find the image region in the w plane corresponding
to the region inside the triangle in the z plane having
vertices at 0 + j0, 2 + j0 and 0 + j2 under the
mapping w = z². Illustrate with sketches.


21 Find the images of the lines y = x and y = −x under
the mapping w = z². Also find the image of the
general line through the origin y = mx. By putting
m = tan θ₀, deduce that straight lines intersecting at
the origin in the z plane map onto lines intersecting
at the origin in the w plane, but that the angle
between these image lines is double that between
the original lines.


22 Consider the mapping w = zⁿ, where n is an integer
(a generalization of the mapping w = z²). Use the
polar representation of complex numbers to show
that

(a) Circles centred at the origin in the z plane are
mapped onto circles centred at the origin in the
w plane.

(b) Straight lines passing through the origin
intersecting with angle θ₀ in the z plane are
mapped onto straight lines passing through the
origin in the w plane but intersecting at an
angle nθ₀.


23 If the complex function

\[ w = \frac{1 + z^2}{z} \]

is represented by a mapping from the z plane onto
the w plane, find u in terms of x and y, and v in terms
of x and y, where z = x + jy, w = u + jv. Find the
image of the unit circle |z| = 1 in the w plane. Show
that the circle centred at the origin, of radius r, in
the z plane (|z| = r) is mapped onto the curve

\[ \left( \frac{r^2 u}{r^2 + 1} \right)^2 + \left( \frac{r^2 v}{r^2 - 1} \right)^2 = r^2 \qquad (r \neq 1) \]

in the w plane. What kind of curves are these? What
happens for very large r?


4.3 Complex differentiation








The derivative of a real function f(x) of a single real variable x at x = x₀ is given by the
limit

\[ f'(x_0) = \lim_{x \to x_0} \left[ \frac{f(x) - f(x_0)}{x - x_0} \right] \]

Here, of course, x₀ is a real number and so can be represented by a single point on the
real line. The point representing x can then approach the fixed x₀ either from the left or
from the right along this line. Let us now turn to complex variables and functions
depending on them. We know that a plane is required to represent complex numbers,
so z₀ is now a fixed point in the Argand diagram, somewhere in the plane. The definition
of the derivative of the function f(z) of the complex variable z at the point z₀ will thus be

\[ f'(z_0) = \lim_{z \to z_0} \left[ \frac{f(z) - f(z_0)}{z - z_0} \right] \]

It may appear that if we merely exchange z for x, the rest of this section will follow
similar lines to the differentiation of functions of real variables. For real variables
taking the limit could only be done from the left or from the right, and the existence of
a unique limit was not difficult to establish. For complex variables, however, the point
that represents the fixed complex number z₀ can be approached along an infinite number
of curves in the z plane. The existence of a unique limit is thus a very stringent
requirement. That most complex functions can be differentiated in the usual way is a
remarkable property of the complex variable. Since z = x + jy, and x and y can vary
independently, there are some connections with the calculus of functions of two real
variables, but we shall not pursue this connection here.

Rather than use the word ‘differentiable’ to describe complex functions for which a
derivative exists, if the function f(z) has a derivative f′(z) that exists at all points of a
region R of the z plane then f(z) is called analytic in R. Other terms such as regular or
holomorphic are also used as alternatives to analytic. (Strictly, functions that have a
power series expansion, as described in Section 4.4.1, are called analytic functions. Since
differentiable functions have a power series expansion they are referred to as analytic
functions. However, there are examples of analytic functions that are not differentiable.)


4.3.1 Cauchy-Riemann equations

The following result is an important property of the analytic function.
If z = x + jy and f(z) = u(x, y) + jv(x, y), and f(z) is analytic in some region R of the
z plane, then the two equations

\[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} \tag{4.19} \]

known as the Cauchy-Riemann equations, hold throughout R.


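It is instructive to prove this result. Since f′(z) exists at any point z₀ in R,

\[ f'(z_0) = \lim_{z \to z_0} \left[ \frac{f(z) - f(z_0)}{z - z_0} \right] \]

where z can tend to z₀ along any path within R. Examination of (4.19) suggests that
we might choose paths parallel to the x direction and parallel to the y direction, since
these will lead to partial derivatives with respect to x and y. Thus, choosing z − z₀ = Δx,
a real path, we see that

\[ f'(z_0) = \lim_{\Delta x \to 0} \left[ \frac{f(z_0 + \Delta x) - f(z_0)}{\Delta x} \right] \]

Since f(z) = u + jv, this means that

\[ f'(z_0) = \lim_{\Delta x \to 0} \left[ \frac{u(x_0 + \Delta x, y_0) + \mathrm{j}v(x_0 + \Delta x, y_0) - u(x_0, y_0) - \mathrm{j}v(x_0, y_0)}{\Delta x} \right] \]

or, on splitting into real and imaginary parts,

\[ f'(z_0) = \lim_{\Delta x \to 0} \left[ \frac{u(x_0 + \Delta x, y_0) - u(x_0, y_0)}{\Delta x} + \mathrm{j}\,\frac{v(x_0 + \Delta x, y_0) - v(x_0, y_0)}{\Delta x} \right] \]

giving

\[ f'(z_0) = \left[ \frac{\partial u}{\partial x} + \mathrm{j}\frac{\partial v}{\partial x} \right]_{x = x_0,\ y = y_0} \tag{4.20} \]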


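Starting again from the definition of f′(z₀), but this time choosing z − z₀ = jΔy for the
path parallel to the y axis, we obtain

\[ f'(z_0) = \lim_{\mathrm{j}\Delta y \to 0} \left[ \frac{f(z_0 + \mathrm{j}\Delta y) - f(z_0)}{\mathrm{j}\Delta y} \right] \]

Once again, using f(z) = u + jv and splitting into real and imaginary parts, we see that

\[
\begin{aligned}
f'(z_0) &= \lim_{\mathrm{j}\Delta y \to 0} \left[ \frac{u(x_0, y_0 + \Delta y) + \mathrm{j}v(x_0, y_0 + \Delta y) - u(x_0, y_0) - \mathrm{j}v(x_0, y_0)}{\mathrm{j}\Delta y} \right] \\
&= \lim_{\Delta y \to 0} \left[ \frac{1}{\mathrm{j}}\,\frac{u(x_0, y_0 + \Delta y) - u(x_0, y_0)}{\Delta y} + \frac{v(x_0, y_0 + \Delta y) - v(x_0, y_0)}{\Delta y} \right]
\end{aligned}
\]

giving

\[ f'(z_0) = \left[ \frac{1}{\mathrm{j}}\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y} \right]_{x = x_0,\ y = y_0} \tag{4.21} \]

Since f′(z₀) must be the same no matter what path is followed, the two values obtained
in (4.20) and (4.21) must be equal. Hence

\[ \frac{\partial u}{\partial x} + \mathrm{j}\frac{\partial v}{\partial x} = \frac{1}{\mathrm{j}}\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y} = -\mathrm{j}\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y} \]

Equating real and imaginary parts then gives the required Cauchy-Riemann equations

\[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y} \]

at the point z = z₀. However, z₀ is an arbitrarily chosen point in the region R; hence the
Cauchy-Riemann equations hold throughout R, and we have thus proved the required
result.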
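
The role played by the two paths in this proof can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it forms the difference quotient with a small real increment and with a small imaginary increment, first for f(z) = z² and then for the complex conjugate z*, which is used here as an assumed example of a function with no unique limit. The function names, the point z₀ = 1 + j2 and the step size h are arbitrary choices.

```python
# Difference quotients along two perpendicular paths (illustrative sketch).

def difference_quotient(f, z0, dz):
    """(f(z0 + dz) - f(z0)) / dz for a small complex increment dz."""
    return (f(z0 + dz) - f(z0)) / dz


def limits_along_axes(f, z0, h=1e-6):
    """Approximate the limit with dz = h (real path) and dz = jh (imaginary path)."""
    along_x = difference_quotient(f, z0, complex(h, 0.0))
    along_y = difference_quotient(f, z0, complex(0.0, h))
    return along_x, along_y


def square(z):
    return z * z


def conjugate(z):
    return z.conjugate()


z0 = complex(1.0, 2.0)
sq_x, sq_y = limits_along_axes(square, z0)
cj_x, cj_y = limits_along_axes(conjugate, z0)
print(sq_x, sq_y)  # both close to 2*z0, so the two paths agree
print(cj_x, cj_y)  # close to +1 and -1, so the two paths disagree
```

For z² both quotients approach the same value, as the proof requires of an analytic function, whereas for z* the two paths give different values, so the Cauchy-Riemann equations cannot hold for it.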

It is tempting to think that should we choose more paths along which to let z − z₀
tend to zero, we could derive more relationships along the same lines as the
Cauchy-Riemann equations. It turns out, however, that we merely reproduce them or expressions
derivable from them, and it is possible to prove that satisfaction of the Cauchy-Riemann
equations (4.19) is a necessary condition for a function f(z) = u(x, y) + jv(x, y), z = x + jy,
to be analytic in a specified region. At points where f′(z) exists it may be obtained from
either (4.20) or (4.21) as

\[ f'(z) = \frac{\partial u}{\partial x} + \mathrm{j}\frac{\partial v}{\partial x} \]

or

\[ f'(z) = \frac{\partial v}{\partial y} - \mathrm{j}\frac{\partial u}{\partial y} \]


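If z is given in the polar form z = r e^{jθ} then

\[ f(z) = u(r, \theta) + \mathrm{j}v(r, \theta) \]

and the corresponding polar forms of the Cauchy-Riemann equations are

\[ \frac{\partial u}{\partial r} = \frac{1}{r}\frac{\partial v}{\partial \theta}, \qquad \frac{\partial v}{\partial r} = -\frac{1}{r}\frac{\partial u}{\partial \theta} \tag{4.22} \]

At points where f′(z) exists it may be obtained from either of

\[ f'(z) = \mathrm{e}^{-\mathrm{j}\theta} \left( \frac{\partial u}{\partial r} + \mathrm{j}\frac{\partial v}{\partial r} \right) \tag{4.23a} \]

or

\[ f'(z) = \mathrm{e}^{-\mathrm{j}\theta} \left( \frac{1}{r}\frac{\partial v}{\partial \theta} - \frac{\mathrm{j}}{r}\frac{\partial u}{\partial \theta} \right) \tag{4.23b} \]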


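Example 4.9


Verify that the function f(z) = z² satisfies the Cauchy-Riemann equations, and determine
the derivative f′(z).


Solution


Since z = x + jy, we have

\[ f(z) = z^2 = (x + \mathrm{j}y)^2 = (x^2 - y^2) + \mathrm{j}2xy \]

so if f(z) = u(x, y) + jv(x, y) then

\[ u = x^2 - y^2, \qquad v = 2xy \]

giving the partial derivatives as

\[ \frac{\partial u}{\partial x} = 2x, \qquad \frac{\partial u}{\partial y} = -2y \]

\[ \frac{\partial v}{\partial x} = 2y, \qquad \frac{\partial v}{\partial y} = 2x \]

It is readily seen that the Cauchy-Riemann equations

\[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} \]

are satisfied.
The derivative f′(z) is then given by

\[ f'(z) = \frac{\partial u}{\partial x} + \mathrm{j}\frac{\partial v}{\partial x} = 2x + \mathrm{j}2y = 2z \]

as expected.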


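Example 4.10


Verify that the exponential function f(z) = e^{αz}, where α is a constant, satisfies the
Cauchy-Riemann equations, and show that f′(z) = α e^{αz}.


Solution


\[ f(z) = u + \mathrm{j}v = \mathrm{e}^{\alpha z} = \mathrm{e}^{\alpha(x + \mathrm{j}y)} = \mathrm{e}^{\alpha x}\,\mathrm{e}^{\mathrm{j}\alpha y} = \mathrm{e}^{\alpha x}(\cos \alpha y + \mathrm{j} \sin \alpha y) \]

so, equating real and imaginary parts,

\[ u = \mathrm{e}^{\alpha x} \cos \alpha y, \qquad v = \mathrm{e}^{\alpha x} \sin \alpha y \]

The partial derivatives are

\[ \frac{\partial u}{\partial x} = \alpha \mathrm{e}^{\alpha x} \cos \alpha y, \qquad \frac{\partial v}{\partial x} = \alpha \mathrm{e}^{\alpha x} \sin \alpha y \]

\[ \frac{\partial u}{\partial y} = -\alpha \mathrm{e}^{\alpha x} \sin \alpha y, \qquad \frac{\partial v}{\partial y} = \alpha \mathrm{e}^{\alpha x} \cos \alpha y \]

confirming that the Cauchy-Riemann equations are satisfied. The derivative f′(z) is
then given by

\[ f'(z) = \frac{\partial u}{\partial x} + \mathrm{j}\frac{\partial v}{\partial x} = \alpha \mathrm{e}^{\alpha x}(\cos \alpha y + \mathrm{j} \sin \alpha y) = \alpha \mathrm{e}^{\alpha z} \]

so that

\[ \frac{\mathrm{d}}{\mathrm{d}z}\,\mathrm{e}^{\alpha z} = \alpha \mathrm{e}^{\alpha z} \tag{4.24} \]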
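
The verification carried out by hand in Examples 4.9 and 4.10 can be mimicked numerically. The following is a rough sketch, assuming central differences are an adequate stand-in for the partial derivatives: it estimates ∂u/∂x, ∂v/∂x, ∂u/∂y and ∂v/∂y for f(z) = e^{αz}, reports the two Cauchy-Riemann residuals, and compares ∂u/∂x + j∂v/∂x with α e^{αz}. The function name `cauchy_riemann_check`, the value α = 0.5 and the sample point are illustrative assumptions.

```python
import cmath


def cauchy_riemann_check(f, x, y, h=1e-5):
    """Estimate the partial derivatives of u and v by central differences.

    Returns (r1, r2, derivative), where r1 = u_x - v_y and r2 = u_y + v_x
    should both be near zero if the Cauchy-Riemann equations (4.19) hold,
    and derivative = u_x + j*v_x as in (4.20).
    """
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    u_x, v_x = fx.real, fx.imag
    u_y, v_y = fy.real, fy.imag
    return u_x - v_y, u_y + v_x, complex(u_x, v_x)


alpha = 0.5          # sample value of the constant in f(z) = exp(alpha*z)
x0, y0 = 0.3, -1.2   # arbitrary sample point
r1, r2, deriv = cauchy_riemann_check(lambda z: cmath.exp(alpha * z), x0, y0)
expected = alpha * cmath.exp(alpha * complex(x0, y0))
print(r1, r2, abs(deriv - expected))  # all three should be very small
```

A numerical check of this kind only samples one point, so it supports rather than replaces the algebraic verification.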


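As in the real variable case, we have (see Section 4.3.1)

\[ \mathrm{e}^{\mathrm{j}z} = \cos z + \mathrm{j} \sin z \tag{4.25} \]

so that cos z and sin z may be expressed as

\[ \cos z = \frac{\mathrm{e}^{\mathrm{j}z} + \mathrm{e}^{-\mathrm{j}z}}{2}, \qquad \sin z = \frac{\mathrm{e}^{\mathrm{j}z} - \mathrm{e}^{-\mathrm{j}z}}{2\mathrm{j}} \tag{4.26a} \]

Using result (4.24) from Example 4.10, it is then readily shown that

\[ \frac{\mathrm{d}}{\mathrm{d}z}(\sin z) = \cos z \]

\[ \frac{\mathrm{d}}{\mathrm{d}z}(\cos z) = -\sin z \]

Similarly, we define the hyperbolic functions sinh z and cosh z by

\[ \sinh z = \frac{\mathrm{e}^{z} - \mathrm{e}^{-z}}{2} = -\mathrm{j} \sin \mathrm{j}z, \qquad \cosh z = \frac{\mathrm{e}^{z} + \mathrm{e}^{-z}}{2} = \cos \mathrm{j}z \tag{4.26b} \]

from which, using (4.24), it is readily deduced that

\[ \frac{\mathrm{d}}{\mathrm{d}z}(\sinh z) = \cosh z \]

\[ \frac{\mathrm{d}}{\mathrm{d}z}(\cosh z) = \sinh z \]

We note from above that e^z has the following real and imaginary parts:

\[ \operatorname{Re}(\mathrm{e}^z) = \mathrm{e}^x \cos y \]

\[ \operatorname{Im}(\mathrm{e}^z) = \mathrm{e}^x \sin y \]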


In real variables the exponential and circular functions are contrasted, one being
monotonic, the other oscillatory. In complex variables, however, the real and imaginary parts
of e^z are (two-variable) combinations of exponential and circular functions, which
might seem surprising for an exponential function. Similarly, the circular functions of
a complex variable have unfamiliar properties. For example, it is easy to see that |cos z|
and |sin z| are unbounded for complex z by using the above relationships between
circular and hyperbolic functions of complex variables. Contrast this with |cos x| ≤ 1
and |sin x| ≤ 1 for a real variable x.
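
The unboundedness of |cos z| is easy to see numerically. The short sketch below uses Python's standard `cmath` and `math` modules to evaluate |cos z| at points z = jy on the imaginary axis, where cos jy = cosh y by (4.26b). The sample values of y are arbitrary.

```python
import cmath
import math

# On the imaginary axis z = jy the identity cos(jy) = cosh(y) holds,
# so |cos z| grows without bound as y increases.
sample_y = (1.0, 5.0, 10.0)
growth = [abs(cmath.cos(complex(0.0, y))) for y in sample_y]

for y, value in zip(sample_y, growth):
    print(y, value, math.cosh(y))  # the last two columns should agree
```

The printed values exceed 1 already at y = 1 and keep growing, in contrast with the bound |cos x| ≤ 1 for real x.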

In a similar way to the method adopted in Examples 4.9 and 4.10 it can be shown
that the derivatives of the majority of functions f(x) of a real variable x carry over to the
complex variable case f(z) at points where f(z) is analytic. Thus, for example,

\[ \frac{\mathrm{d}}{\mathrm{d}z} z^n = n z^{n-1} \]

for all z in the z plane, and

\[ \frac{\mathrm{d}}{\mathrm{d}z} \ln z = \frac{1}{z} \]

for all z in the z plane except for points on the non-positive real axis, where ln z is
non-analytic.
It can also be shown that the rules associated with derivatives of a function of a real
variable, such as the sum, product, quotient and chain rules, carry over to the complex
variable case. Thus,

\[ \frac{\mathrm{d}}{\mathrm{d}z}[f(z) + g(z)] = \frac{\mathrm{d}f(z)}{\mathrm{d}z} + \frac{\mathrm{d}g(z)}{\mathrm{d}z} \]

\[ \frac{\mathrm{d}}{\mathrm{d}z}[f(z)g(z)] = f(z)\frac{\mathrm{d}g(z)}{\mathrm{d}z} + \frac{\mathrm{d}f(z)}{\mathrm{d}z}\,g(z) \]

\[ \frac{\mathrm{d}}{\mathrm{d}z} f(g(z)) = \frac{\mathrm{d}f}{\mathrm{d}g}\,\frac{\mathrm{d}g}{\mathrm{d}z} \]

\[ \frac{\mathrm{d}}{\mathrm{d}z}\left[ \frac{f(z)}{g(z)} \right] = \frac{g(z)f'(z) - f(z)g'(z)}{[g(z)]^2} \]


4.3.2 Conjugate and harmonic functions


A pair of functions u(x, y) and v(x, y) of the real variables x and y that satisfy the
Cauchy-Riemann equations (4.19) are said to be conjugate functions. (Note here
the different use of the word ‘conjugate’ to that used in complex number work, where
z* = x − jy is the complex conjugate of z = x + jy.) Conjugate functions satisfy the
orthogonality property in that the curves in the (x, y) plane defined by u(x, y) = constant
and v(x, y) = constant are orthogonal curves. This follows since the gradient at any point
on the curve u(x, y) = constant is given by

\[ \left[ \frac{\mathrm{d}y}{\mathrm{d}x} \right]_u = -\frac{\partial u / \partial x}{\partial u / \partial y} \]

and the gradient at any point on the curve v(x, y) = constant is given by

\[ \left[ \frac{\mathrm{d}y}{\mathrm{d}x} \right]_v = -\frac{\partial v / \partial x}{\partial v / \partial y} \]

It follows from the Cauchy-Riemann equations (4.19) that

\[ \left[ \frac{\mathrm{d}y}{\mathrm{d}x} \right]_u \left[ \frac{\mathrm{d}y}{\mathrm{d}x} \right]_v = -1 \]

so the curves are orthogonal.
A function that satisfies the Laplace equation in two dimensions is said to be
harmonic; that is, u(x, y) is a harmonic function if

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \]

It is readily shown (see Example 4.12) that if f(z) = u(x, y) + jv(x, y) is analytic, so that
the Cauchy-Riemann equations are satisfied, then both u and v are harmonic functions.
Therefore u and v are conjugate harmonic functions. Harmonic functions have applications
in such areas as stress analysis in plates, inviscid two-dimensional fluid flow and
electrostatics.


Example 4.11


Given u(x, y) = x² − y² + 2x, find the conjugate function v(x, y) such that f(z) =
u(x, y) + jv(x, y) is an analytic function of z throughout the z plane.


Solution


We are given u(x, y) = x² − y² + 2x, and, since f(z) = u + jv is to be analytic, the
Cauchy-Riemann equations must hold. Thus, from (4.19),

\[ \frac{\partial v}{\partial y} = \frac{\partial u}{\partial x} = 2x + 2 \]

Integrating this with respect to y gives

\[ v = 2xy + 2y + F(x) \]

where F(x) is an arbitrary function of x, since the integration was performed holding
x constant. Differentiating v partially with respect to x gives

\[ \frac{\partial v}{\partial x} = 2y + \frac{\mathrm{d}F}{\mathrm{d}x} \]

but this equals −∂u/∂y by the second of the Cauchy-Riemann equations (4.19). Hence

\[ \frac{\partial u}{\partial y} = -2y - \frac{\mathrm{d}F}{\mathrm{d}x} \]

But since u = x² − y² + 2x, ∂u/∂y = −2y, and comparison yields F(x) = constant. This
constant is set equal to zero, since no conditions have been given by which it can be
determined. Hence

\[ u(x, y) + \mathrm{j}v(x, y) = x^2 - y^2 + 2x + \mathrm{j}(2xy + 2y) \]

To confirm that this is a function of z, note that f(z) is f(x + jy), and becomes just f(x)
if we set y = 0. Therefore we set y = 0 to obtain

\[ f(x + \mathrm{j}0) = f(x) = u(x, 0) + \mathrm{j}v(x, 0) = x^2 + 2x \]

and it follows that

\[ f(z) = z^2 + 2z \]

which can be easily checked by separation into real and imaginary parts.


Example 4.12


Show that the real and imaginary parts u(x, y) and v(x, y) of a complex analytic function
f(z) are harmonic.


Solution


Since

\[ f(z) = u(x, y) + \mathrm{j}v(x, y) \]

is analytic, the Cauchy-Riemann equations

\[ \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}, \qquad \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \]

are satisfied. Differentiating the first with respect to x gives

\[ \frac{\partial^2 v}{\partial x^2} = -\frac{\partial^2 u}{\partial x\,\partial y} = -\frac{\partial^2 u}{\partial y\,\partial x} = -\frac{\partial}{\partial y}\left( \frac{\partial u}{\partial x} \right) \]

which is −∂²v/∂y², by the second Cauchy-Riemann equation. Hence

\[ \frac{\partial^2 v}{\partial x^2} = -\frac{\partial^2 v}{\partial y^2}, \quad \text{or} \quad \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0 \]

and v is a harmonic function.
Similarly,

\[ \frac{\partial^2 u}{\partial y^2} = -\frac{\partial^2 v}{\partial y\,\partial x} = -\frac{\partial}{\partial x}\left( \frac{\partial v}{\partial y} \right) = -\frac{\partial^2 u}{\partial x^2} \]

so that

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \]

and u is also a harmonic function. We have assumed that both u and v have continuous
second-order partial derivatives, so that

\[ \frac{\partial^2 u}{\partial x\,\partial y} = \frac{\partial^2 u}{\partial y\,\partial x}, \qquad \frac{\partial^2 v}{\partial x\,\partial y} = \frac{\partial^2 v}{\partial y\,\partial x} \]
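
Both examples above lend themselves to a quick numerical sanity check. The sketch below is an illustration under stated assumptions rather than part of the solutions: it compares u + jv from Example 4.11 with z² + 2z at one sample point, and it applies a five-point finite-difference estimate of the Laplacian to the real and imaginary parts of e^z, and to x² + y² as a deliberately non-harmonic comparison. All function names, the sample point and the step size are illustrative choices.

```python
import cmath


def f_example(z):
    """f(z) = z**2 + 2z, the function found in Example 4.11."""
    return z * z + 2 * z


def laplacian(g, x, y, h=1e-3):
    """Five-point finite-difference estimate of g_xx + g_yy at (x, y)."""
    total = g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h) - 4 * g(x, y)
    return total / (h * h)


def u_exp(x, y):
    return cmath.exp(complex(x, y)).real  # e^x cos y


def v_exp(x, y):
    return cmath.exp(complex(x, y)).imag  # e^x sin y


def not_harmonic(x, y):
    return x * x + y * y  # Laplacian equals 4, so this is not harmonic


# Example 4.11: compare u + jv with z**2 + 2z at a sample point.
x0, y0 = 0.7, -0.4
u = x0**2 - y0**2 + 2 * x0
v = 2 * x0 * y0 + 2 * y0
mismatch = abs(complex(u, v) - f_example(complex(x0, y0)))

# Example 4.12: real and imaginary parts of an analytic function.
lap_u = laplacian(u_exp, x0, y0)
lap_v = laplacian(v_exp, x0, y0)
lap_bad = laplacian(not_harmonic, x0, y0)
print(mismatch, lap_u, lap_v, lap_bad)
```

The first three printed values should be close to zero, while the last should be close to 4, which distinguishes the harmonic parts of e^z from a function that fails the Laplace equation.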




4.3.3 Exercises 


24 Determine whether the following functions are
analytic, and find the derivative where appropriate:

(a) z e^z    (b) sin 4z

(c) zz*      (d) cos 2z


25 Determine the constants a and b in order that

\[ w = x^2 + ay^2 - 2xy + \mathrm{j}(bx^2 - y^2 + 2xy) \]

be analytic. For these values of a and b find the
derivative of w, and express both w and dw/dz as
functions of z = x + jy.


26 Find a function v(x, y) such that, given u = 2x(1 − y),
f(z) = u + jv is analytic in z.


27 Show that φ(x, y) = e^x(x cos y − y sin y) is a harmonic
function, and find the conjugate harmonic function
ψ(x, y). Write φ(x, y) + jψ(x, y) as a function of
z = x + jy only.


28 Show that u(x, y) = sin x cosh y is harmonic. Find
the harmonic conjugate v(x, y) and express w = u + jv
as a function of z = x + jy.


29 Find the orthogonal trajectories of the following
families of curves:

(a) x³y − xy³ = α (constant α)

(b) e^{−x} cos y + xy = α (constant α)


30 Find the real and imaginary parts of the functions

(a) z² e^{2z}

(b) sin 2z

Verify that they are analytic and find their
derivatives.


31 Give a definition of the inverse sine function
sin⁻¹ z for complex z. Find the real and imaginary
parts of sin⁻¹ z. (Hint: put z = sin w, split into
real and imaginary parts, and with w = u + jv
and z = x + jy solve for u and v in terms of x
and y.) Is sin⁻¹ z analytic? If so, what is its
derivative?


32 Establish that if z = x + jy,
|sinh y| ≤ |sin z| ≤ cosh y.


4.3.4 Mappings revisited


In Section 4.2 we examined mappings from the z plane to the w plane, where in the
main the relationship between w and z, w = f(z), was linear or bilinear. There is an
important property of mappings, hinted at in Example 4.8 when considering the mapping
w = z². A mapping w = f(z) that preserves angles is called conformal. Under such
a mapping, the angle between two intersecting curves in the z plane is the same as the
angle between the corresponding intersecting curves in the w plane. The sense of the
angle is also preserved. That is, if θ is the angle between curves 1 and 2 taken in the
anticlockwise sense in the z plane then θ is also the angle between the image of curve 1
and the image of curve 2 in the w plane, and it too is taken in the anticlockwise sense.
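
Angle preservation can be probed numerically. The sketch below is an illustration under stated assumptions: it takes two short straight segments leaving a point z₀ in the directions 0 and θ, maps their end points under w = z², and measures the anticlockwise angle between the image segments. It is evaluated at a point where the derivative 2z is non-zero and at z₀ = 0, where the derivative vanishes. The function name `image_angle`, the step length t and the sample point 1 + j are assumptions made for the example.

```python
import cmath
import math


def image_angle(f, z0, theta, t=1e-3):
    """Angle between the images of two short segments leaving z0.

    The segments leave z0 in the directions 0 and theta (radians); the
    angle is measured anticlockwise from the first image to the second.
    """
    d1 = f(z0 + t) - f(z0)
    d2 = f(z0 + t * cmath.exp(1j * theta)) - f(z0)
    return cmath.phase(d2 / d1)


def square(z):
    return z * z


theta = math.pi / 3
regular = image_angle(square, complex(1.0, 1.0), theta)  # derivative is non-zero here
at_origin = image_angle(square, 0j, theta)               # derivative is zero here
print(theta, regular, at_origin)
```

At the regular point the image angle stays close to θ, whereas at the origin it comes out as 2θ, which is the angle doubling that makes w = z² fail to be conformal there.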


Figure 4.15
Conformal mappings. (Sketch: curves 1 and 2 intersecting at angle θ in the z plane, and their
images f(curve 1) and f(curve 2) under the conformal mapping w = f(z) intersecting at the
same angle θ in the w plane.)


Figure 4.15 should make the idea of a conformal mapping clearer. If f(z) is analytic
then w = f(z) defines a conformal mapping except at points where the derivative f′(z)
is zero.

Clearly the linear mappings

\[ w = \alpha z + \beta \qquad (\alpha \neq 0) \]

are conformal everywhere, since dw/dz = α and is not zero for any point in the z plane.
Bilinear mappings given by (4.12) are not so straightforward to check. However, as we
saw in Section 4.2.4, (4.12) can be rearranged as

\[ w = \lambda + \frac{\mu}{\alpha z + \beta} \qquad (\alpha, \mu \neq 0) \]

Thus

\[ \frac{\mathrm{d}w}{\mathrm{d}z} = -\frac{\mu\alpha}{(\alpha z + \beta)^2} \]

which again is never zero for any point in the z plane. In fact, the only mapping we have
considered so far that has a point at which it is not conformal is w = z²
(cf. Example 4.8), which is not conformal at z = 0.


Example 4.13


Determine the points at which the mapping w = z + 1/z is not conformal and demonstrate
this by considering the image in the w plane of the real axis in the z plane.


Solution


Taking z = x + jy and w = u + jv, we have

\[ w = u + \mathrm{j}v = x + \mathrm{j}y + \frac{x - \mathrm{j}y}{x^2 + y^2} \]

which, on equating real and imaginary parts, gives

\[ u = x + \frac{x}{x^2 + y^2} \]

\[ v = y - \frac{y}{x^2 + y^2} \]

The real axis, y = 0, in the z plane corresponds to v = 0, the real axis in the w plane.
Note, however, that the fixed point of the mapping is given by

\[ z = z + \frac{1}{z} \]

or z = ∞. From the Cauchy-Riemann equations it is readily shown that w is analytic
everywhere except at z = 0. Also, dw/dz = 0 when

\[ 1 - \frac{1}{z^2} = 0, \quad \text{that is} \quad z = \pm 1 \]

which are both on the real axis. Thus the mapping fails to be conformal at z = 0 and
z = ±1. The image of z = 1 is w = 2, and the image of z = −1 is w = −2. Consideration
of the image of the real axis is therefore perfectly adequate, since this is a curve passing
through each point where w = z + 1/z fails to be conformal. It would be satisfying if we
could analyse this mapping in the same manner as we did with w = z² in Example 4.8.
Unfortunately, we cannot do this, because the algebra gets unwieldy (and, indeed, our
knowledge of algebraic curves is also too scanty). Instead, let us look at the image of
the point z = 1 + ε, where ε is a small real number. ε > 0 corresponds to the point Q
just to the right of z = 1 on the real axis in the z plane, and the point P just to the
left of z = 1 corresponds to ε < 0 (Figure 4.16).

Figure 4.16 Image of z = 1 + ε of Example 4.13.

If z = 1 + ε then

\[
\begin{aligned}
w &= 1 + \varepsilon + \frac{1}{1 + \varepsilon} \\
  &= 1 + \varepsilon + (1 + \varepsilon)^{-1} \\
  &= 1 + \varepsilon + 1 - \varepsilon + \varepsilon^2 - \varepsilon^3 + \dots \\
  &\approx 2 + \varepsilon^2
\end{aligned}
\]

if |ε| is much smaller than 1 (we shall discuss the validity of the power series expansion
in Section 4.4). Whether ε is positive or negative, the point w = 2 + ε² is to the right of
w = 2 in the w plane as indicated by the point R in Figure 4.16. Therefore, as ε → 0, a
curve (the real axis) that passes through z = 1 in the z plane making an angle θ = π
corresponds to a curve (again the real axis) that approaches w = 2 in the w plane along
the real axis from the right making an angle θ = 0. Non-conformality has thus been
confirmed. The treatment of z = −1 follows in an identical fashion, so the details
are omitted. Note that when y = 0 (v = 0), u = x + 1/x so, as the real axis in the z plane
is traversed from x = −∞ to x = 0, the real axis in the w plane is traversed from
u = −∞ to −2 and back to u = −∞ again (when x = −1, u reaches −2). As the real
axis in the z plane is traversed from x = 0 through x = 1 to x = +∞, so the real axis in
the w plane is traversed from u = +∞ to u = +2 (x = 1) back to u = +∞ again. Hence the
points on the real axis in the w plane in the range −2 < u < 2 do not correspond to real
values of z. Solving u = x + 1/x for x gives

\[ x = \tfrac{1}{2}\left[ u \pm \sqrt{u^2 - 4} \right] \]

which makes this point obvious. Figure 4.17 shows the image in the w plane of the real
axis in the z plane. This mapping is very rich in interesting properties, but we shall not
pursue it further here. Aeronautical engineers may well meet it again if they study the
flow around an aerofoil in two dimensions, for this mapping takes circles centred at the
origin in the z plane onto meniscus (lens-shaped) regions in the w plane, and only a
slight alteration is required before these images become aerofoil-shaped.

Figure 4.17 Image in w plane of the real axis in the z plane for Example 4.13.


Example 4.14


Examine the mapping

\[ w = \mathrm{e}^z \]

by (a) finding the images in the w plane of the lines x = constant and y = constant in
the z plane, and (b) finding the image in the w plane of the left half-plane (x < 0) in the
z plane.


Solution


Taking z = x + jy and w = u + jv, for w = e^z we have

\[ u = \mathrm{e}^x \cos y \]

\[ v = \mathrm{e}^x \sin y \]

Squaring and adding these two equations, we obtain

\[ u^2 + v^2 = \mathrm{e}^{2x} \]

On the other hand, dividing the two equations gives

\[ \frac{v}{u} = \tan y \]

We can now tackle the questions.

(a) Since u² + v² = e^{2x}, putting x = constant shows that the lines parallel to the imaginary
axis in the z plane correspond to circles centred at the origin in the w plane.
The equation

\[ \frac{v}{u} = \tan y \]

shows that the lines parallel to the real axis in the z plane correspond to straight
lines through the origin in the w plane (v = u tan α if y = α, a constant).
Figure 4.18 shows the general picture.

Figure 4.18 Mapping of lines under w = e^z.

(b) Since u² + v² = e^{2x}, if x = 0 then u² + v² = 1, so the imaginary axis in the z plane
corresponds to the unit circle in the w plane. If x < 0 then e^{2x} < 1, and as x → −∞,
e^{2x} → 0, so the left half of the z plane corresponds to the interior of the unit circle
in the w plane, as illustrated in Figure 4.19.

Figure 4.19 Mapping of half-plane under w = e^z.
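
The conclusions of parts (a) and (b) can be spot-checked with a few sample points. The following minimal sketch uses Python's standard `cmath` module; the particular lines x = 0.3 and y = 0.8 and the sample points in the left half-plane are arbitrary illustrative choices.

```python
import cmath
import math

# Points on the vertical line x = 0.3 should land on the circle |w| = e**0.3.
x_const = 0.3
radii = [abs(cmath.exp(complex(x_const, y))) for y in (-2.0, 0.0, 1.0, 3.0)]

# Points on the horizontal line y = 0.8 should all have argument 0.8,
# so they lie on one ray of the straight line v = u tan(0.8).
y_const = 0.8
args = [cmath.phase(cmath.exp(complex(x, y_const))) for x in (-1.0, 0.0, 2.0)]

# Sample points of the left half-plane x < 0 should fall inside the unit circle.
inside = all(abs(cmath.exp(complex(x, y))) < 1.0
             for x in (-0.01, -1.0, -5.0) for y in (-3.0, 0.0, 2.5))
print(radii, math.exp(x_const), args, inside)
```

A handful of sample points cannot prove the mapping properties, but agreement here is consistent with u² + v² = e^{2x} and v/u = tan y.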




4.3.5 Exercises 


33 Determine the points at which the following
mappings are not conformal:

(a) w = z² − 1    (b) w = 2z³ − 21z² + 72z + 6

(c) w = 8z + 1/(2z²)


34 Follow Example 4.13 for the mapping w = z − 1/z.
Again determine the points at which the mapping is
not conformal, but this time demonstrate this by
looking at the image of the imaginary axis.


35 Find the region of the w plane corresponding to
the following regions of the z plane under the
exponential mapping w = e^z:

(a) 0 ≤ x < ∞    (b) 0 ≤ x ≤ 1, 0 ≤ y ≤ 1

(c) ½π ≤ y ≤ π, 0 ≤ x < ∞


36 Consider the mapping w = sin z. Determine the
points at which the mapping is not conformal.
By finding the images in the w plane of the
lines x = constant and y = constant in the z plane
(z = x + jy), draw the mapping along similar lines to
Figures 4.14 and 4.18.


37 Show that the transformation

\[ z = \zeta + \frac{a^2}{\zeta} \]

where z = x + jy and ζ = R e^{jθ} maps a circle, with
centre at the origin and radius a, in the ζ plane, onto
a straight line segment in the z plane. What is the
length of the line? What happens if the circle in the
ζ plane is centred at the origin but is of radius b,
where b ≠ a?
4.4 Complex series


In Modern Engineering Mathematics we saw that there were distinct advantages in being
able to express a function f(x), such as the exponential, trigonometric and logarithmic
functions, of a real variable x in terms of its power series expansion

\[ f(x) = \sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + a_2 x^2 + \dots + a_r x^r + \dots \tag{4.27} \]

Power series are also very important in dealing with complex functions. In fact, any real
function f(x) which has a power series of the form in (4.27) has a corresponding
complex function f(z) having the same power series expansion, that is

\[ f(z) = \sum_{n=0}^{\infty} a_n z^n = a_0 + a_1 z + a_2 z^2 + \dots + a_r z^r + \dots \tag{4.28} \]

This property enables us to extend real functions to the complex case, so that methods
based on power series expansions have a key role to play in formulating the theory of
complex functions. In this section we shall consider some of the properties of the power
series expansion of a complex function by drawing, wherever possible, an analogy with
the power series expansion of the corresponding real function.


4.4.1 Power series


A series having the form

\[ \sum_{n=0}^{\infty} a_n (z - z_0)^n = a_0 + a_1(z - z_0) + a_2(z - z_0)^2 + \dots + a_r(z - z_0)^r + \dots \tag{4.29} \]

in which the coefficients a_r are real or complex and z₀ is a fixed point in the complex
z plane is called a power series about z₀ or a power series centred on z₀. Where z₀ = 0,
the series (4.29) reduces to the series (4.28), which is a power series centred at the
origin. In fact, on making the change of variable z′ = z − z₀, (4.29) takes the form (4.28),
so there is no loss of generality in considering the latter below.

Tests for the convergence or divergence of complex power series are similar to those
used for power series of a real variable. However, in complex series it is essential that
the modulus |a_n| be used. For example, the geometric series

\[ \sum_{n=0}^{\infty} z^n \]

has a sum to N terms

\[ S_N = \sum_{n=0}^{N-1} z^n = \frac{1 - z^N}{1 - z} \]

and converges, if |z| < 1, to the limit 1/(1 − z) as N → ∞. If |z| ≥ 1, the series diverges.
These results appear to be identical with the requirement that |x| < 1 to ensure
convergence of the real power series

\[ \sum_{n=0}^{\infty} x^n \]

However, in the complex case the geometrical interpretation is different in that the
condition |z| < 1 implies that z lies inside the circle centred at the origin and radius 1
in the z plane. Thus the series $\sum_{n=0}^{\infty} z^n$ converges if z lies inside this circle and diverges
if z lies on or outside it. The situation is illustrated in Figure 4.20.

Figure 4.20 Region of convergence of $\sum_{n=0}^{\infty} z^n$.


The existence of such a circle leads to an important concept in that in general there
exists a circle centred at the origin and of radius R such that the series

\[ \sum_{n=0}^{\infty} a_n z^n \quad \begin{cases} \text{converges if } |z| < R \\ \text{diverges if } |z| > R \end{cases} \]

The radius R is called the radius of convergence of the power series; what happens
when |z| = R is normally investigated as a special case.

We have introduced the radius of convergence based on a circle centred at the
origin, while the concept obviously does not depend on the location of the centre of
the circle. If the series is centred on z₀ as in (4.29) then the convergence circle would
be centred on z₀. Indeed it could even be centred at infinity, when the power series
becomes

\[ \sum_{n=0}^{\infty} a_n z^{-n} = a_0 + \frac{a_1}{z} + \frac{a_2}{z^2} + \dots + \frac{a_r}{z^r} + \dots \]

which we shall consider further in Section 4.4.5.

In order to determine the radius of convergence R for a given series, various tests for
convergence, such as those introduced in Modern Engineering Mathematics for real
series, may be applied. In particular, using d’Alembert’s ratio test, it can be shown that
the radius of convergence R of the complex series $\sum_{n=0}^{\infty} a_n z^n$ is given by

\[ R = \lim_{n \to \infty} \left| \frac{a_n}{a_{n+1}} \right| \tag{4.30} \]

provided that the limit exists. Then the series is convergent within the disc |z| < R.
In general, of course, the limit may not exist, and in such cases an alternative method
must be used.
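
Formula (4.30) can be explored numerically by evaluating the ratio |a_n/a_{n+1}| for increasing n. The sketch below is an illustration only: the coefficients a_n = 2ⁿ/(n + 1) are an assumed example (not taken from the text) for which the ratio equals (n + 2)/(2(n + 1)) and therefore tends to R = 1/2. The function names are hypothetical.

```python
def ratio_estimate(coefficient, n):
    """|a_n / a_(n+1)|, whose limit as n grows is R in (4.30) when it exists."""
    return abs(coefficient(n) / coefficient(n + 1))


def a(n):
    """Sample coefficients a_n = 2**n / (n + 1), chosen only for illustration."""
    return 2.0**n / (n + 1)


estimates = [ratio_estimate(a, n) for n in (10, 100, 500)]
print(estimates)  # the values decrease towards 0.5
```

Evaluating the ratio at a finite n only suggests the limit; as the text notes, the limit may fail to exist for other coefficient sequences, in which case this estimate is not meaningful.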


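Example 4.15


Find the power series, in the form indicated, representing the function 1/(z − 3) in the
following three regions:

(a) |z| < 3; $\sum_{n=0}^{\infty} a_n z^n$

(b) |z − 2| < 1; $\sum_{n=0}^{\infty} a_n (z - 2)^n$

(c) |z| > 3; $\sum_{n=0}^{\infty} \dfrac{a_n}{z^n}$

and sketch these regions on an Argand diagram.


Solution


We know that the binomial series expansion

\[ (1 + z)^n = 1 + nz + \frac{n(n-1)}{2!} z^2 + \dots + \frac{n(n-1)(n-2)\cdots(n-r+1)}{r!} z^r + \dots \]

is valid for |z| < 1. To solve the problem, we exploit this result by expanding the
function 1/(z − 3) in three different ways:

(a)

\[ \frac{1}{z - 3} = \frac{-\tfrac{1}{3}}{1 - \tfrac{1}{3}z} = -\tfrac{1}{3}\left(1 - \tfrac{1}{3}z\right)^{-1} = -\tfrac{1}{3}\left[ 1 + \tfrac{1}{3}z + \left(\tfrac{1}{3}z\right)^2 + \dots + \left(\tfrac{1}{3}z\right)^n + \dots \right] \]

for |⅓z| < 1, that is |z| < 3, giving the power series

\[ \frac{1}{z - 3} = -\tfrac{1}{3} - \tfrac{1}{9}z - \tfrac{1}{27}z^2 - \dots \qquad (|z| < 3) \]

(b)

\[ \frac{1}{z - 3} = \frac{1}{(z - 2) - 1} = [(z - 2) - 1]^{-1} = -[1 + (z - 2) + (z - 2)^2 + \dots] \qquad (|z - 2| < 1) \]

giving the power series

\[ \frac{1}{z - 3} = -1 - (z - 2) - (z - 2)^2 - \dots \qquad (|z - 2| < 1) \]

(c)

\[ \frac{1}{z - 3} = \frac{1/z}{1 - 3/z} = \frac{1}{z}\left[ 1 + \frac{3}{z} + \left(\frac{3}{z}\right)^2 + \dots \right] \]

giving the power series

\[ \frac{1}{z - 3} = \frac{1}{z} + \frac{3}{z^2} + \frac{9}{z^3} + \dots \qquad (|z| > 3) \]

The three regions are sketched in Figure 4.21. Note that none of the regions includes
the point z = 3, which is termed a singularity of the function, a concept we shall discuss
in Section 4.5.1.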
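
The three expansions can be compared with 1/(z − 3) numerically by summing a finite number of terms at a point inside each region. The following is a minimal sketch under stated assumptions: the function names, the number of terms (80) and the three sample points are illustrative choices, and each sample point is chosen to lie inside the region in which the corresponding series is expected to converge.

```python
def series_a(z, terms=80):
    """Partial sum of -(1/3) * sum (z/3)**n, intended for |z| < 3."""
    return sum(-(z / 3) ** n / 3 for n in range(terms))


def series_b(z, terms=80):
    """Partial sum of -sum (z - 2)**n, intended for |z - 2| < 1."""
    return sum(-((z - 2) ** n) for n in range(terms))


def series_c(z, terms=80):
    """Partial sum of sum 3**n / z**(n + 1), intended for |z| > 3."""
    return sum(3**n / z ** (n + 1) for n in range(terms))


def exact(z):
    return 1 / (z - 3)


checks = [
    (series_a, complex(1.0, 1.0)),    # |z| is about 1.41, inside |z| < 3
    (series_b, complex(2.3, 0.4)),    # |z - 2| = 0.5, inside |z - 2| < 1
    (series_c, complex(5.0, -2.0)),   # |z| is about 5.39, inside |z| > 3
]
errors = [abs(series(z) - exact(z)) for series, z in checks]
print(errors)  # each error should be very small
```

Evaluating a series at a point outside its stated region (for example `series_a` at z = 5) would instead produce partial sums that grow with the number of terms.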
Figure 4.21 Regions
of convergence for the series in Example 4.15.


In Example 4.15 the whole of the circle |z| = 3 was excluded from the three regions
where the power series converge. In fact, it is possible to include any selected point in
the z plane as a centre of the circle in which to define a power series that converges
to 1/(z − 3) everywhere inside the circle, with the exception of the point z = 3. For
example, the point z = 4j would lead to the expansion of

\[ \frac{1}{z - 4\mathrm{j} + 4\mathrm{j} - 3} = \frac{1}{4\mathrm{j} - 3} \cdot \frac{1}{\dfrac{z - 4\mathrm{j}}{4\mathrm{j} - 3} + 1} \]

in a binomial series in powers of (z − 4j)/(4j − 3), which converges to 1/(z − 3) inside
the circle

\[ |z - 4\mathrm{j}| = |4\mathrm{j} - 3| = \sqrt{16 + 9} = 5 \]

We should not expect the point z = 3 to be included in any of the circles, since the
function 1/(z − 3) is infinite there and hence not defined.


Example 4.16


Prove that both the power series $\sum_{n=0}^{\infty} a_n z^n$ and the corresponding series of derivatives
$\sum_{n=0}^{\infty} n a_n z^{n-1}$ have the same radius of convergence.


Solution


Let R be the radius of convergence of the power series $\sum_{n=0}^{\infty} a_n z^n$. Since $\lim_{n \to \infty} (a_n z_0^n) = 0$
(otherwise the series has no chance of convergence), if |z₀| < R for some complex number
z₀ then it is always possible to choose

\[ |a_n| < |z_0|^{-n} \]

for n > N, with N a fixed integer. We now use d’Alembert’s ratio test, namely

\[ \text{if } \lim_{n \to \infty} \left| \frac{a_{n+1} z^{n+1}}{a_n z^n} \right| < 1 \text{ then } \sum_{n=0}^{\infty} a_n z^n \text{ converges} \]

\[ \text{if } \lim_{n \to \infty} \left| \frac{a_{n+1} z^{n+1}}{a_n z^n} \right| > 1 \text{ then } \sum_{n=0}^{\infty} a_n z^n \text{ diverges} \]

The differentiated series $\sum_{n=1}^{\infty} n a_n z^{n-1}$ satisfies

\[ \sum_{n=1}^{\infty} |n a_n z^{n-1}| \le \sum_{n=1}^{\infty} n |a_n| |z|^{n-1} < \sum_{n=1}^{\infty} n \frac{|z|^{n-1}}{|z_0|^n} \]

which, by the ratio test, converges if 0 < |z₀| < R, since |z| < |z₀| and |z₀| can be as
close to R as we choose. If, however, |z| > R then $\lim_{n \to \infty} (a_n z^n) \neq 0$ and thus
$\lim_{n \to \infty} (n a_n z^{n-1}) \neq 0$ too. Hence R is also the radius of convergence of the differentiated
series $\sum_{n=1}^{\infty} n a_n z^{n-1}$.


The result obtained in Example 4.16 is important, since if the complex function 


fe Y, a 


converges in |z| < R then the derivative 


f'@= >, naz" 
n=1 
also converges in |z| < R. We can go on differentiating f(z) through its power series 
and be sure that the differentiated function and the differentiated power series are equal 
inside the circle of convergence. 
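
As an informal numerical check of this result, the following Python sketch compares partial sums of the geometric series $\sum z^n$ and of its term-by-term derivative $\sum n z^{n-1}$ with 1/(1 - z) and 1/(1 - z)^2 at a point inside the unit circle. The choice of series, the sample point and the function name are illustrative assumptions, not taken from the text.

```python
def series_and_derivative(z, terms=200):
    """Partial sums of sum z**n and of the differentiated series sum n*z**(n-1)."""
    series = sum(z**n for n in range(terms))
    derivative = sum(n * z**(n - 1) for n in range(1, terms))
    return series, derivative


z = 0.3 + 0.4j  # |z| = 0.5, inside the circle of convergence |z| < 1
series, derivative = series_and_derivative(z)

# Compare with the closed forms 1/(1 - z) and its derivative 1/(1 - z)^2
series_error = abs(series - 1 / (1 - z))
derivative_error = abs(derivative - 1 / (1 - z) ** 2)
print(series_error, derivative_error)
```

Both errors should be at rounding level, consistent with the differentiated series converging to the derivative inside the same circle.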


4.4.2 Exercises 





38 Find the power series representation for the function 1/(z - j) in the regions

   (a) |z| < 1
   (b) |z| > 1
   (c) |z - 1 - j| < √2

   Deduce that the radius of convergence of the power series representation of this
   function is $|z_0 - j|$, where $z = z_0$ is the centre of the circle of convergence ($z_0 \ne j$).

39 Find the power series representation of the function

   \[
   f(z) = \frac{1}{z^2 + 1}
   \]

   in the disc |z| < 1. Use Example 4.16 to deduce the power series for

   (a) $\dfrac{1}{(z^2+1)^2}$   (b) $\dfrac{1}{(z^2+1)^3}$

   valid in this same disc.


4.4.3 Taylor series 


In Modern Engineering Mathematics we introduced the Taylor series expansion

\[
f(x + a) = f(a) + \frac{x}{1!} f^{(1)}(a) + \frac{x^2}{2!} f^{(2)}(a) + \dots = \sum_{n=0}^{\infty} \frac{x^n}{n!} f^{(n)}(a) \tag{4.31}
\]


of a function f(x) of a real variable x about x = a and valid within the interval of
convergence of the power series. For the engineer the ability to express a function in such
a power series expansion is seen to be particularly useful in the development of
numerical methods and the assessment of errors. The ability to express a complex function as
a Taylor series is also important to engineers in many fields of applications, such as
control and communications theory. The form of the Taylor series in the complex case
is identical with that of (4.31).

Figure 4.22 Region of convergence of the Taylor series.

If f(z) is a complex function analytic inside and on a simple closed curve C (usually
a circle) in the z plane then it follows from Example 4.16 that the higher derivatives of
f(z) also exist inside C. If $z_0$ and $z_0 + h$ are two fixed points inside C then

\[
f(z_0 + h) = f(z_0) + h f^{(1)}(z_0) + \frac{h^2}{2!} f^{(2)}(z_0) + \dots + \frac{h^n}{n!} f^{(n)}(z_0) + \dots
\]

where $f^{(k)}(z_0)$ is the kth derivative of f(z) evaluated at $z = z_0$. Normally, $z = z_0 + h$ is
introduced so that $h = z - z_0$, and the series expansion then becomes

\[
f(z) = f(z_0) + (z - z_0) f^{(1)}(z_0) + \frac{(z - z_0)^2}{2!} f^{(2)}(z_0) + \dots
       + \frac{(z - z_0)^n}{n!} f^{(n)}(z_0) + \dots
     = \sum_{n=0}^{\infty} \frac{(z - z_0)^n}{n!} f^{(n)}(z_0) \tag{4.32}
\]


The power series expansion (4.32) is called the Taylor series expansion of the complex
function f(z) about $z_0$. The region of convergence of this series is $|z - z_0| < R$,
a disc centred on $z = z_0$ and of radius R, the radius of convergence. Figure 4.22
illustrates the region of convergence. When $z_0 = 0$, as in real variables, the series
expansion about the origin is often called a Maclaurin series expansion.

Since the proof of the Taylor series expansion does not add to our understanding 
of how to apply the result to the solution of engineering problems, we omit it at this 
stage. 


Example 4.17 Determine the Taylor series expansion of the function

\[
f(z) = \frac{1}{z(z - 2j)}
\]

about the point z = j:


(a) directly up to the term $(z - j)^4$,


(b) using the binomial expansion. 


Determine the radius of convergence. 


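Solution (a) The disadvantage with functions other than the most straightforward is that
obtaining their derivatives is prohibitively complicated in terms of algebra.
It is easier in this particular case to resolve the given function into partial
fractions as

\[
f(z) = \frac{1}{z(z - 2j)} = \frac{1}{2j}\left(\frac{1}{z - 2j} - \frac{1}{z}\right)
\]

The right-hand side is now far easier to differentiate repeatedly. Proceeding to
determine $f^{(n)}(j)$, we have

\[
f(z) = \frac{1}{2j}\left(\frac{1}{z - 2j} - \frac{1}{z}\right), \quad \text{so that } f(j) = 1
\]

\[
f^{(1)}(z) = \frac{1}{2j}\left[-\frac{1}{(z - 2j)^2} + \frac{1}{z^2}\right], \quad \text{so that } f^{(1)}(j) = 0
\]

\[
f^{(2)}(z) = \frac{1}{2j}\left[\frac{2}{(z - 2j)^3} - \frac{2}{z^3}\right], \quad \text{so that } f^{(2)}(j) = -2
\]

\[
f^{(3)}(z) = \frac{1}{2j}\left[-\frac{6}{(z - 2j)^4} + \frac{6}{z^4}\right], \quad \text{so that } f^{(3)}(j) = 0
\]

\[
f^{(4)}(z) = \frac{1}{2j}\left[\frac{24}{(z - 2j)^5} - \frac{24}{z^5}\right], \quad \text{so that } f^{(4)}(j) = 24
\]

leading from (4.32) to the Taylor series expansion

\[
\frac{1}{z(z - 2j)} = 1 - \frac{2}{2!}(z - j)^2 + \frac{24}{4!}(z - j)^4 + \dots = 1 - (z - j)^2 + (z - j)^4 - \dots
\]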


(b) To use the binomial expansion, we first express z(z - 2j) as (z - j + j)(z - j - j),
which, being the difference of two squares ($(z - j)^2 - j^2$), leads to

\[
f(z) = \frac{1}{(z - j)^2 + 1} = [1 + (z - j)^2]^{-1}
\]

Use of the binomial expansion then gives

\[
f(z) = 1 - (z - j)^2 + (z - j)^4 - (z - j)^6 + \dots
\]

valid for |z - j| < 1, so the radius of convergence is 1.


The points where f(z) is infinite (its singularities) are precisely at distance 1 away 
from z = j, so this value for the radius of convergence comes as no surprise. 
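
The expansion found in Example 4.17 can be checked numerically. The short Python sketch below sums the series $1 - (z - j)^2 + (z - j)^4 - \dots$ at a sample point inside the disc |z - j| < 1 and compares it with 1/[z(z - 2j)]. The sample point, the number of terms and the function names are illustrative assumptions.

```python
def f(z):
    """The function of Example 4.17."""
    return 1 / (z * (z - 2j))


def taylor_about_j(z, terms=60):
    """Partial sum of 1 - (z - j)^2 + (z - j)^4 - ... about z = j."""
    w = (z - 1j) ** 2
    return sum((-1) ** k * w**k for k in range(terms))


z_inside = 0.4 + 1.3j  # |z - j| = 0.5 < 1
inside_error = abs(taylor_about_j(z_inside) - f(z_inside))
print(inside_error)
```

Moving the sample point beyond distance 1 from z = j makes the partial sums grow without bound, in line with the radius of convergence found above.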


Example 4.18 Suggest a function to represent the power series

\[
1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \dots + \frac{z^n}{n!} + \dots
\]

and determine its radius of convergence.


Solution Set

\[
f(z) = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \dots = \sum_{n=0}^{\infty} \frac{z^n}{n!}
\]

Assuming we can differentiate the series for f(z) term by term, we obtain

\[
f'(z) = \sum_{n=1}^{\infty} \frac{n z^{n-1}}{n!} = \sum_{n=1}^{\infty} \frac{z^{n-1}}{(n-1)!} = f(z)
\]

Hence f(z) is its own derivative. Since $e^x$ is its own derivative in real variables, and is
the only such function, it seems sensible to propose that

\[
f(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!} = e^z \tag{4.33}
\]

the complex exponential function. Indeed the complex exponential $e^z$ is defined by
the power series (4.33). According to d'Alembert's ratio test the series $\sum_{n=0}^{\infty} a_n$ is
convergent if $|a_{n+1}/a_n| \to L < 1$ as $n \to \infty$, where L is a real constant. If $a_n = z^n/n!$ then
$|a_{n+1}/a_n| = |z|/(n + 1)$ which is less than unity for sufficiently large n, no matter how
big |z| is. Hence $\sum_{n=0}^{\infty} z^n/n!$ is convergent for all z and so has an infinite radius of
convergence. Note that this is confirmed from (4.30). Such functions are called entire.


In the same way as we define the exponential function $e^z$ by the power series
expansion (4.33), we can define the circular functions sin z and cos z by the power series
expansions

\[
\sin z = z - \frac{z^3}{3!} + \frac{z^5}{5!} - \frac{z^7}{7!} + \dots + (-1)^n \frac{z^{2n+1}}{(2n+1)!} + \dots
\]

\[
\cos z = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \frac{z^6}{6!} + \dots + (-1)^n \frac{z^{2n}}{(2n)!} + \dots
\]

both of which are valid for all z. Using these power series definitions, we can readily
prove the result (4.25), namely

\[
e^{jz} = \cos z + j \sin z
\]
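
A quick numerical illustration of this identity, using only the power series definitions, is sketched below in Python. The truncation lengths, the sample value of z and the function names are illustrative assumptions; the comparison with `cmath.exp` is included only as an independent cross-check.

```python
import cmath
from math import factorial


def exp_series(z, terms=40):
    """Partial sum of the series (4.33) for e^z."""
    return sum(z**n / factorial(n) for n in range(terms))


def sin_series(z, terms=20):
    """Partial sum of the power series for sin z."""
    return sum((-1) ** n * z ** (2 * n + 1) / factorial(2 * n + 1) for n in range(terms))


def cos_series(z, terms=20):
    """Partial sum of the power series for cos z."""
    return sum((-1) ** n * z ** (2 * n) / factorial(2 * n) for n in range(terms))


z = 1.5 - 0.7j
lhs = exp_series(1j * z)                  # e^{jz} from its series
rhs = cos_series(z) + 1j * sin_series(z)  # cos z + j sin z from their series
print(abs(lhs - rhs), abs(lhs - cmath.exp(1j * z)))
```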


4.4.4 Exercises 


40 Find the first four non-zero terms of the Taylor
   series expansions of the following functions about
   the points indicated, and determine the radius of
   convergence in each case:

   (a) $\dfrac{1}{1 + z}$ (z = 1)

   (b) $\dfrac{1}{z(z - 4j)}$ (z = 2j)

   (c) $\dfrac{1}{z^2}$ (z = 1 + j)

41 Find the Maclaurin series expansion of the function

   \[
   f(z) = \frac{1}{1 + z + z^2}
   \]

   up to and including the term in $z^3$.

42 Without explicitly finding each Taylor series
   expansion, find the radius of convergence of
   the function

   \[
   f(z) = \frac{1}{z^4 - 1}
   \]

   about the three points z = 0, z = 1 + j and z = 2 + 2j.
   Why is there no Taylor series expansion of this
   function about z = j?

43 Determine a Maclaurin series expansion
   of f(z) = tan z. What is its radius of
   convergence?


4.4.5 Laurent series


Figure 4.23 
The Riemann sphere. 


Figure 4.24 Region of 
validity of the Laurent 
series. 


Let us now examine more closely the solution of Example 4.15(c), where the power
series obtained was

\[
\frac{1}{z - 3} = \frac{1}{z} + \frac{3}{z^2} + \frac{9}{z^3} + \dots
\]

valid for |z| > 3. In the context of the definition, this is a power series about 'z = ∞',
the ‘point at infinity’. Some readers, quite justifiably, may not be convinced that there 
is a single unique point at infinity. Figure 4.23 shows what is termed the Riemann 
sphere. A sphere lies on the complex z plane, with the contact point at the origin O. Let 
O’ be the top of the sphere, at the diametrically opposite point to O. Now, for any 
arbitrarily chosen point P in the z plane, by joining O’ and P we determine a unique 
point P’ where the line O’P intersects the sphere. There is thus exactly one point P’ on 
the sphere corresponding to each P in the z plane. The point O' itself is the only point 
on the sphere that does not have a corresponding point on the (finite) z plane; we there- 
fore say it corresponds to the point at infinity on the z plane. 







Returning to consider power series, we know that, inside the radius of convergence, 
a given function and its Taylor series expansion are identically equal. Points at which 
a function fails to be analytic are called singularities, which we shall discuss in 
Section 4.5.1. No Taylor series expansion is possible about a singularity. Indeed, a
Taylor series expansion about a point $z_0$ at which a function is analytic is only valid
within a circle, centre $z_0$, up to the nearest singularity. Thus all singularities must be
excluded in any Taylor series consideration. The Laurent series representation includes 
(or at least takes note of) the behaviour of the function in the vicinity of a singularity. 

If f(z) is a complex function analytic on concentric circles $C_1$ and $C_2$ of radii $r_1$ and
$r_2$ (with $r_2 < r_1$), centred at $z_0$, and also analytic throughout the region between the
circles (that is, an annular region), then for each point z within the annulus (Figure 4.24)
f(z) may be represented by the Laurent series

\[
f(z) = \sum_{n=-\infty}^{\infty} c_n (z - z_0)^n
     = \dots + \frac{c_{-r}}{(z - z_0)^r} + \frac{c_{-r+1}}{(z - z_0)^{r-1}} + \dots + \frac{c_{-1}}{z - z_0}
       + c_0 + c_1 (z - z_0) + \dots + c_r (z - z_0)^r + \dots \tag{4.34}
\]

where in general the coefficients $c_r$ are complex. The annular shape of the region is
necessary in order to exclude the point $z = z_0$, which may be a singularity of f(z), from
consideration. If f(z) is analytic at $z = z_0$ then $c_n = 0$ for n = -1, -2, ..., -∞, and the
Laurent series reduces to the Taylor series.

The Laurent series (4.34) for f(z) may be written as

\[
f(z) = \sum_{n=-\infty}^{-1} c_n (z - z_0)^n + \sum_{n=0}^{\infty} c_n (z - z_0)^n
\]

and the first sum on the right-hand side, the *non-Taylor' part, is called the principal 
part of the Laurent series. 

Of course, we can seldom actually sum a series to infinity. There is therefore often more 
than theoretical interest in the so-called ‘remainder terms’, these being the difference 
between the first n terms of a power series and the exact value of the function. For 
both Taylor and Laurent series these remainder terms are expressed, as in the case of 
real variables, in terms of the (n + 1)th derivative of the function itself. For Laurent series
in complex variables these derivatives can be expressed in terms of contour integrals 
(Section 4.6), which may be amenable to simple computation. Many of the details are 
outside the scope of this book, but there is some introductory material in Section 4.6. 


Example 4.19 For $f(z) = 1/[z^2(z + 1)]$ find the Laurent series expansion about (a) z = 0 and (b) z = -1.
Determine the region of validity in each case.


Solution As with Example 4.15, problems such as this are tackled by making use of the binomial
series expansion

\[
(1 + z)^n = 1 + nz + \frac{n(n-1)}{2!} z^2 + \dots + \frac{n(n-1)(n-2)\dots(n-r+1)}{r!} z^r + \dots
\]

provided that |z| < 1.


(a) In this case $z_0 = 0$, so we need a series in powers of z. Thus

\[
\frac{1}{z^2(1 + z)} = \frac{1}{z^2}(1 + z)^{-1} = \frac{1}{z^2}(1 - z + z^2 - z^3 + z^4 - \dots) \quad (0 < |z| < 1)
\]

Thus the required Laurent series expansion is

\[
\frac{1}{z^2(z + 1)} = \frac{1}{z^2} - \frac{1}{z} + 1 - z + z^2 - \dots
\]

valid for 0 < |z| < 1. The value z = 0 must be excluded because of the first two
terms of the series. The region 0 < |z| < 1 is an example of a punctured disc, a
common occurrence in this branch of mathematics.


(b) In this case $z_0 = -1$, so we need a series in powers of (z + 1). Thus

\[
\frac{1}{z^2(z + 1)} = \frac{1}{z + 1}(z + 1 - 1)^{-2} = \frac{1}{z + 1}[1 - (z + 1)]^{-2}
\]

\[
= \frac{1}{z + 1}[1 + 2(z + 1) + 3(z + 1)^2 + 4(z + 1)^3 + \dots]
\]

\[
= \frac{1}{z + 1} + 2 + 3(z + 1) + 4(z + 1)^2 + \dots
\]

valid for 0 < |z + 1| < 1. Note that in a meniscus-shaped region (that is, the
region of overlap between the two circular regions |z| < 1 and |z + 1| < 1) both
Laurent series are simultaneously valid. This is quite typical, and not a cause for
concern.
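
To illustrate that both expansions of Example 4.19 represent the same function in the region of overlap, the Python sketch below evaluates partial sums of each series at one point satisfying both |z| < 1 and |z + 1| < 1. The sample point, truncation length and function names are illustrative assumptions.

```python
def f(z):
    """The function of Example 4.19."""
    return 1 / (z**2 * (z + 1))


def laurent_about_0(z, terms=200):
    """1/z^2 - 1/z + 1 - z + ... , valid for 0 < |z| < 1."""
    return sum((-1) ** k * z ** (k - 2) for k in range(terms))


def laurent_about_minus_1(z, terms=200):
    """1/(z+1) + 2 + 3(z+1) + 4(z+1)^2 + ... , valid for 0 < |z+1| < 1."""
    u = z + 1
    return sum((k + 1) * u ** (k - 1) for k in range(terms))


z = -0.5 + 0.3j  # |z| and |z + 1| are both about 0.583
error_about_0 = abs(laurent_about_0(z) - f(z))
error_about_minus_1 = abs(laurent_about_minus_1(z) - f(z))
print(error_about_0, error_about_minus_1)
```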


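Example 4.20 Determine the Laurent series expansions of

\[
f(z) = \frac{1}{(z + 1)(z + 3)}
\]

valid for

(a) 1 < |z| < 3
(b) |z| > 3
(c) 0 < |z + 1| < 2
(d) |z| < 1


Solution (a) Resolving into partial fractions,

\[
f(z) = \frac{1}{2}\left(\frac{1}{z + 1}\right) - \frac{1}{2}\left(\frac{1}{z + 3}\right)
\]

Since |z| > 1 and |z| < 3, we express this as

\[
f(z) = \frac{1}{2z}\left(1 + \frac{1}{z}\right)^{-1} - \frac{1}{6}\left(1 + \frac{z}{3}\right)^{-1}
\]

\[
= \frac{1}{2z}\left(1 - \frac{1}{z} + \frac{1}{z^2} - \frac{1}{z^3} + \dots\right)
  - \frac{1}{6}\left(1 - \frac{z}{3} + \frac{z^2}{9} - \frac{z^3}{27} + \dots\right)
\]

\[
= \dots - \frac{1}{2z^4} + \frac{1}{2z^3} - \frac{1}{2z^2} + \frac{1}{2z} - \frac{1}{6} + \frac{z}{18} - \frac{z^2}{54} + \frac{z^3}{162} - \dots
\]

(b) Again

\[
f(z) = \frac{1}{2}\left(\frac{1}{z + 1}\right) - \frac{1}{2}\left(\frac{1}{z + 3}\right)
\]

Since |z| > 3, we express this as

\[
f(z) = \frac{1}{2z}\left(1 + \frac{1}{z}\right)^{-1} - \frac{1}{2z}\left(1 + \frac{3}{z}\right)^{-1}
\]

\[
= \frac{1}{2z}\left(1 - \frac{1}{z} + \frac{1}{z^2} - \frac{1}{z^3} + \dots\right)
  - \frac{1}{2z}\left(1 - \frac{3}{z} + \frac{9}{z^2} - \frac{27}{z^3} + \dots\right)
\]

\[
= \frac{1}{z^2} - \frac{4}{z^3} + \frac{13}{z^4} - \frac{40}{z^5} + \dots
\]

(c) We can proceed as in Example 4.19. Alternatively, we can take z + 1 = u; then
0 < |u| < 2 and

\[
f(u) = \frac{1}{u(u + 2)} = \frac{1}{2u(1 + \tfrac{1}{2}u)}
     = \frac{1}{2u}\left(1 - \tfrac{1}{2}u + \tfrac{1}{4}u^2 - \tfrac{1}{8}u^3 + \dots\right)
\]

giving

\[
f(z) = \frac{1}{2(z + 1)} - \frac{1}{4} + \frac{1}{8}(z + 1) - \frac{1}{16}(z + 1)^2 + \dots
\]

(d) Again

\[
f(z) = \frac{1}{2(z + 1)} - \frac{1}{2(z + 3)}
\]

Since |z| < 1, we express this as

\[
f(z) = \frac{1}{2(1 + z)} - \frac{1}{6(1 + \tfrac{1}{3}z)} = \tfrac{1}{2}(1 + z)^{-1} - \tfrac{1}{6}(1 + \tfrac{1}{3}z)^{-1}
\]

\[
= \tfrac{1}{2}(1 - z + z^2 - z^3 + \dots) - \tfrac{1}{6}\left(1 - \tfrac{1}{3}z + \tfrac{1}{9}z^2 - \tfrac{1}{27}z^3 + \dots\right)
\]

\[
= \tfrac{1}{3} - \tfrac{4}{9}z + \tfrac{13}{27}z^2 - \tfrac{40}{81}z^3 + \dots
\]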

Example 4.21 Determine the Laurent series expansion of the function $f(z) = z^3 e^{1/z}$ about

(a) z = 0
(b) z = a, a finite, non-zero complex number
(c) z = ∞


Solution (a) From (4.33),

\[
e^z = 1 + z + \frac{z^2}{2!} + \dots \quad (0 \le |z| < \infty)
\]

Substituting 1/z for z, we obtain

\[
e^{1/z} = 1 + \frac{1}{z} + \frac{1}{2!\,z^2} + \dots \quad (0 < |z| \le \infty)
\]

so that

\[
z^3 e^{1/z} = z^3 + z^2 + \frac{z}{2!} + \frac{1}{3!} + \frac{1}{4!\,z} + \frac{1}{5!\,z^2} + \dots \quad (0 < |z| \le \infty)
\]

This series has infinitely many terms in its principal part, but stops at $z^3$ (it is
written back to front). Series with never-ending principal parts are a problem, and
fortunately are uncommon in engineering. Note also that the series is valid in an
infinite punctured disc.


(b) The value of f(a) must be $a^3 e^{1/a}$, which is not infinite since a ≠ 0. Therefore f(z)
has a Taylor series expansion

\[
f(z) = f(a) + (z - a) f^{(1)}(a) + \frac{(z - a)^2}{2!} f^{(2)}(a) + \dots
\]

about z = a. We have

\[
f^{(1)}(z) = \frac{d}{dz}(z^3 e^{1/z}) = 3z^2 e^{1/z} - z e^{1/z}
\]

\[
f^{(2)}(z) = \frac{d}{dz}(3z^2 e^{1/z} - z e^{1/z}) = 6z e^{1/z} - 4 e^{1/z} + \frac{1}{z} e^{1/z}
\]

giving the series as

\[
z^3 e^{1/z} = a^3 e^{1/a} + (z - a)(3a^2 e^{1/a} - a e^{1/a})
            + \frac{1}{2!}(z - a)^2\left(6a e^{1/a} - 4 e^{1/a} + \frac{1}{a} e^{1/a}\right) + \dots
\]

which is valid in the region |z - a| < R, where R is the distance between the
origin, where f(z) is not defined, and the point a; hence R = |a|. Thus the region
of validity for this Taylor series is the disc |z - a| < |a|.


(c) To expand about z = ∞, let w = 1/z, so that

\[
f(z) = \frac{1}{w^3} e^w
\]

Expanding about w = 0 then gives

\[
f\left(\frac{1}{w}\right) = \frac{1}{w^3}\left(1 + w + \frac{w^2}{2!} + \frac{w^3}{3!} + \dots\right)
\]

\[
= \frac{1}{w^3} + \frac{1}{w^2} + \frac{1}{2!\,w} + \frac{1}{3!} + \frac{w}{4!} + \dots \quad (0 < |w| < \infty)
\]

Note that this time there are only three terms in the principal part of f(z) (= f(1/w)).


4.4.6 Exercises


44 Determine the Laurent series expansion of

   \[
   f(z) = \frac{1}{z(z - 1)^2}
   \]

   about (a) z = 0 and (b) z = 1, and specify the region
   of validity for each.

45 Determine the Laurent series expansion of the
   function

   \[
   f(z) = z^2 \sin\frac{1}{z}
   \]

   about the points

   (a) z = 0   (b) z = ∞
   (c) z = a, a finite non-zero complex number

   (For (c), do not calculate the coefficients explicitly.)

46 Expand

   \[
   f(z) = \frac{z}{(z - 1)(2 - z)}
   \]

   in a Laurent series expansion valid for

   (a) |z| < 1   (b) 1 < |z| < 2   (c) |z| > 2
   (d) |z - 1| > 1   (e) 0 < |z - 2| < 1


4.5 Singularities, zeros and residues


4.5.1 Singularities and zeros 


As indicated in Section 4.4.5 a singularity of a complex function f(z) is a point of 
the z plane where f(z) ceases to be analytic. Normally, this means f(z) is infinite at such 
a point, but it can also mean that there is a choice of values, and it is not possible to 
pick a particular one. In this chapter we shall be mainly concerned with singularities 
at which f(z) has an infinite value. A zero of f(z) is a point in the z plane at which
f(z) = 0.

Singularities can be classified in terms of the Laurent series expansion of f(z) about
the point in question. If f(z) has a Taylor series expansion, that is a Laurent series
expansion with zero principal part, about the point $z = z_0$, then $z_0$ is a regular point of
f(z). If f(z) has a Laurent series expansion with only a finite number of terms in its
principal part, for example

\[
f(z) = \frac{a_{-m}}{(z - z_0)^m} + \dots + \frac{a_{-1}}{z - z_0} + a_0 + a_1(z - z_0) + \dots + a_m(z - z_0)^m + \dots
\]

then f(z) has a singularity at $z = z_0$ called a pole. If there are m terms in the principal
part, as in this example, then the pole is said to be of order m. Another way of defining
this is to say that $z_0$ is a pole of order m if

\[
\lim_{z \to z_0} (z - z_0)^m f(z) = a_{-m} \tag{4.35}
\]

where $a_{-m}$ is finite and non-zero. If the principal part of the Laurent series for f(z) at
$z = z_0$ has infinitely many terms, which means that the above limit does not exist for any
m, then $z = z_0$ is called an essential singularity of f(z). (Note that in Example 4.20 the
expansions given as representations of the function f(z) = 1/[(z + 1)(z + 3)] in parts (a)
and (b) are not valid at z = 0. Hence, despite appearances, they do not represent a
function which possesses an essential singularity at z = 0. In this case f(z) is regular at
z = 0 with a value 1/3.)

If f(z) appears to be singular at $z = z_0$, but it turns out to be possible to define a Taylor
series expansion there, then $z = z_0$ is called a removable singularity. The following
examples illustrate these cases.


(a) $f(z) = z^{-1}$ has a pole of order one, called a simple pole, at z = 0.

(b) $f(z) = (z - 1)^{-3}$ has a pole of order three at z = 1.

(c) $f(z) = e^{1/(z - j)}$ has an essential singularity at z = j.

(d) The function

\[
f(z) = \frac{z - 1}{(z + 2)(z - 3)^2}
\]

has a zero at z = 1, a simple pole at z = -2 and a pole of order two at z = 3.

(e) The function

\[
f(z) = \frac{\sin z}{z}
\]

is not defined at z = 0, and appears to be singular there. However, defining

\[
\operatorname{sinc} z = \begin{cases} (\sin z)/z & (z \ne 0) \\ 1 & (z = 0) \end{cases}
\]

gives a function having a Taylor series expansion

\[
\operatorname{sinc} z = 1 - \frac{z^2}{3!} + \frac{z^4}{5!} - \dots
\]

that is regular at z = 0. Therefore the (apparent) singularity at z = 0 has been
removed, and thus f(z) = (sin z)/z has a removable singularity at z = 0.
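
The limit test (4.35) lends itself to a rough numerical experiment. The Python sketch below applies it to the function of case (d), evaluating $(z - z_0)^m f(z)$ at a point very close to each pole instead of taking a true limit. The offset `eps` and the helper name `scaled_value` are illustrative assumptions.

```python
def f(z):
    """The function of case (d): zero at 1, simple pole at -2, double pole at 3."""
    return (z - 1) / ((z + 2) * (z - 3) ** 2)


def scaled_value(z0, m, eps=1e-6):
    """Evaluate (z - z0)^m f(z) close to z0 as a stand-in for the limit (4.35)."""
    z = z0 + eps
    return (z - z0) ** m * f(z)


order_two_limit = scaled_value(3, 2)     # approaches (3 - 1)/(3 + 2) = 0.4
order_one_value = scaled_value(3, 1)     # grows like 0.4/eps, so m = 1 is too small
simple_pole_limit = scaled_value(-2, 1)  # approaches (-2 - 1)/(-2 - 3)^2 = -0.12
print(order_two_limit, order_one_value, simple_pole_limit)
```

A finite, non-zero value for m = 2 at z = 3, together with unbounded growth for m = 1, is consistent with a pole of order two there.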


Functions whose only singularities are poles are called meromorphic and, by and 
large, in engineering applications of complex variables most functions are meromorphic. 
To help familiarize the reader with these definitions, the following example should 
prove instructive. 


Example 4.22 Find the singularities and zeros of the following complex functions:

(a) $\dfrac{1}{z^4 - z^2(1 + j) + j}$   (b) $\dfrac{z - 1}{z^4 - z^2(1 + j) + j}$

(c) $\dfrac{\sin(z - 1)}{z^4 - z^2(1 + j) + j}$   (d) $\dfrac{1}{[z^4 - z^2(1 + j) + j]^3}$


Solution (a) For

\[
f(z) = \frac{1}{z^4 - z^2(1 + j) + j}
\]

the numerator is never zero, and the denominator is only infinite when z is
infinite. Thus f(z) has no zeros in the finite z plane. The denominator is zero
when

\[
z^4 - z^2(1 + j) + j = 0
\]

which factorizes to give

\[
(z^2 - 1)(z^2 - j) = 0
\]

leading to

\[
z^2 = 1 \text{ or } j
\]

so that the singularities are at

\[
z = +1,\ -1,\ \sqrt{\tfrac{1}{2}}(1 + j),\ \sqrt{\tfrac{1}{2}}(-1 - j) \tag{4.36}
\]

all of which are simple poles since none of the roots are repeated.
(b) The function

\[
f(z) = \frac{z - 1}{z^4 - z^2(1 + j) + j}
\]

is similar to f(z) in (a), except that it has the additional term z - 1 in the
numerator. Therefore, at first glance, it seems that the singularities are as in (4.36).
However, a closer look indicates that f(z) can be rewritten as

\[
f(z) = \frac{z - 1}{(z - 1)(z + 1)[z + \sqrt{\tfrac{1}{2}}(1 + j)][z - \sqrt{\tfrac{1}{2}}(1 + j)]}
\]

and the factor z - 1 cancels, rendering z = 1 a removable singularity, and reducing
f(z) to

\[
f(z) = \frac{1}{(z + 1)[z + \sqrt{\tfrac{1}{2}}(1 + j)][z - \sqrt{\tfrac{1}{2}}(1 + j)]}
\]

which has no (finite) zeros and z = -1, $\sqrt{\tfrac{1}{2}}(1 + j)$ and $\sqrt{\tfrac{1}{2}}(-1 - j)$ as simple poles.


(c) In the case of

\[
f(z) = \frac{\sin(z - 1)}{z^4 - z^2(1 + j) + j}
\]

the function may be rewritten as

\[
f(z) = \frac{\sin(z - 1)}{z - 1} \cdot \frac{1}{(z + 1)[z + \sqrt{\tfrac{1}{2}}(1 + j)][z - \sqrt{\tfrac{1}{2}}(1 + j)]}
\]

Now

\[
\frac{\sin(z - 1)}{z - 1} \to 1 \quad \text{as } z \to 1
\]

so once again z = 1 is a removable singularity. Also, as in (b), z = -1, $\sqrt{\tfrac{1}{2}}(1 + j)$
and $\sqrt{\tfrac{1}{2}}(-1 - j)$ are simple poles and the only singularities. However,

\[
\sin(z - 1) = 0
\]

has the general solution z = 1 + Nπ (N = 0, ±1, ±2, ...). Thus, apart from N = 0,
all of these are zeros of f(z).
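(d) For

\[
f(z) = \frac{1}{[z^4 - z^2(1 + j) + j]^3}
\]

factorizing as in (b), we have

\[
f(z) = \frac{1}{(z - 1)^3 (z + 1)^3 [z + \sqrt{\tfrac{1}{2}}(1 + j)]^3 [z - \sqrt{\tfrac{1}{2}}(1 + j)]^3}
\]

so -1, +1, $\sqrt{\tfrac{1}{2}}(1 + j)$ and $\sqrt{\tfrac{1}{2}}(-1 - j)$ are still singularities, but this time they are
triply repeated. Hence they are all poles of order three. There are no zeros.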
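
The pole locations (4.36) used throughout Example 4.22 can be confirmed by substituting them into the quartic denominator, and their simplicity by checking that its derivative does not vanish there. The Python sketch below does both; the function names are illustrative assumptions.

```python
import cmath


def denominator(z):
    """The quartic z^4 - z^2(1 + j) + j appearing in Example 4.22."""
    return z**4 - (1 + 1j) * z**2 + 1j


def denominator_derivative(z):
    """Derivative of the quartic, 4z^3 - 2(1 + j)z."""
    return 4 * z**3 - 2 * (1 + 1j) * z


root_half = cmath.sqrt(0.5)
claimed_poles = [1, -1, root_half * (1 + 1j), root_half * (-1 - 1j)]

# A root of the quartic gives a pole of 1/quartic; a non-zero slope means the root is simple
residuals = [abs(denominator(p)) for p in claimed_poles]
slopes = [abs(denominator_derivative(p)) for p in claimed_poles]
print(residuals, slopes)
```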


4.5.2 Exercises 








47 Determine the location of, and classify, the
   singularities and zeros of the following functions.
   Specify also any zeros that may exist.

   (a) $\dfrac{\cos z}{z^2}$   (b) $\dfrac{1}{(z + j)^2 (z - j)}$   (c) $\dfrac{z}{z^4 - 1}$

   (d) coth z   (e) $\dfrac{\sin z}{z^2 + \pi^2}$   (f) $e^{z/(1 - z)}$

   (g) $\dfrac{z - 1}{z^2 + 1}$   (h) $\dfrac{z + j}{(z + 2)^3 (z - 3)}$

   (i) $\dfrac{1}{z^2 (z^2 - 4z + 5)}$

48 Expand each of the following functions in a Laurent
   series about z = 0, and give the type of singularity
   (if any) in each case:

   (a) $\dfrac{1 - \cos z}{z}$   (b) $\dfrac{e^{z^2}}{z^3}$

   (c) $z^{-1} \cosh z^{-1}$

   (d) $\tan^{-1}(z^2 + 2z + 2)$

49 Show that if f(z) is the ratio of two polynomials
   then it cannot have an essential singularity.

4.5.3 Residues 


If a complex function f(z) has a pole at the point $z = z_0$ then the coefficient $a_{-1}$ of the
term $1/(z - z_0)$ in the Laurent series expansion of f(z) about $z = z_0$ is called the residue
of f(z) at the point $z = z_0$. The importance of residues will become apparent when
we discuss integration in Section 4.6. Here we shall concentrate on efficient ways
of calculating them, usually without finding the Laurent series expansion explicitly.
However, experience and judgement are sometimes the only help in finding the easiest
way of calculating residues. First let us consider the case when f(z) has a simple pole
at $z = z_0$. This implies, from the definition of a simple pole, that

\[
f(z) = \frac{a_{-1}}{z - z_0} + a_0 + a_1(z - z_0) + \dots
\]

in an appropriate annulus $S < |z - z_0| < R$. Multiplying by $z - z_0$ gives

\[
(z - z_0) f(z) = a_{-1} + a_0(z - z_0) + \dots
\]

which is a Taylor series expansion of $(z - z_0) f(z)$. If we let z approach $z_0$, we then obtain
the result

\[
\text{residue at a simple pole } z_0 = \lim_{z \to z_0} [(z - z_0) f(z)] = a_{-1} \tag{4.37}
\]


Hence evaluating this limit gives a way of calculating the residue at a simple pole. 


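Example 4.23 Determine the residues of

\[
f(z) = \frac{2z}{(z^2 + 1)(2z - 1)}
\]

at each of its poles in the finite z plane.


Solution Factorizing the denominator, we have

\[
f(z) = \frac{2z}{(z - j)(z + j)(2z - 1)}
\]

so that f(z) has simple poles at z = j, -j and 1/2. Using (4.37) then gives

\[
\text{residue at } z = j:\quad \lim_{z \to j} (z - j)\frac{2z}{(z - j)(z + j)(2z - 1)} = \frac{2j}{2j(2j - 1)} = -\frac{1 + 2j}{5}
\]

\[
\text{residue at } z = -j:\quad \lim_{z \to -j} (z + j)\frac{2z}{(z - j)(z + j)(2z - 1)} = \frac{-2j}{-2j(-2j - 1)} = -\frac{1 - 2j}{5}
\]

\[
\text{residue at } z = \tfrac{1}{2}:\quad \lim_{z \to 1/2} (z - \tfrac{1}{2})\frac{2z}{(z - j)(z + j)\,2(z - \tfrac{1}{2})} = \frac{\tfrac{1}{2}}{(\tfrac{1}{2} - j)(\tfrac{1}{2} + j)} = \frac{2}{5}
\]

Note in this last case the importance of expressing 2z - 1 as 2(z - 1/2).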
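
The limit (4.37) can be approximated numerically by evaluating $(z - z_0) f(z)$ very close to the pole. The Python sketch below does this for the three poles of Example 4.23; the offset `eps` and the helper name `simple_pole_residue` are illustrative assumptions, and the result is accurate only to about the size of `eps`.

```python
def f(z):
    """The function of Example 4.23."""
    return 2 * z / ((z**2 + 1) * (2 * z - 1))


def simple_pole_residue(func, z0, eps=1e-7):
    """Approximate lim (z - z0) func(z) by evaluating just beside the pole."""
    z = z0 + eps
    return (z - z0) * func(z)


residue_j = simple_pole_residue(f, 1j)         # expected -(1 + 2j)/5
residue_minus_j = simple_pole_residue(f, -1j)  # expected -(1 - 2j)/5
residue_half = simple_pole_residue(f, 0.5)     # expected 2/5
print(residue_j, residue_minus_j, residue_half)
```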


Example 4.24 Determine the residues of the function $1/(1 + z^4)$ at each of its poles in the finite z plane.


Solution The function $1/(1 + z^4)$ has poles where

\[
1 + z^4 = 0
\]

that is, at the points where

\[
z^4 = -1 = e^{\pi j + 2\pi n j}
\]

with n an integer. Recalling how to determine the roots of a complex number, these
points are

\[
z = e^{\pi j/4 + \pi j n/2} \quad (n = 0, 1, 2, 3)
\]

that is

\[
z = e^{\pi j/4},\ e^{3\pi j/4},\ e^{5\pi j/4},\ e^{7\pi j/4}
\]

or

\[
z = (1 + j)/\sqrt{2},\ (-1 + j)/\sqrt{2},\ (-1 - j)/\sqrt{2},\ (1 - j)/\sqrt{2}
\]

To find the residue at the point $z_0$, we use (4.37), giving

\[
\text{residue at } z_0 = \lim_{z \to z_0} \left(\frac{z - z_0}{1 + z^4}\right)
\]

where $z_0$ is one of the above roots of $z^4 = -1$. It pays to use L'Hôpital's rule before
substituting for a particular $z_0$. This is justified since $(z - z_0)/(1 + z^4)$ is of the
indeterminate form 0/0 at each of the four simple poles. Differentiating numerator and
denominator gives

\[
\lim_{z \to z_0} \left(\frac{z - z_0}{1 + z^4}\right) = \lim_{z \to z_0} \left(\frac{1}{4z^3}\right) = \frac{1}{4z_0^3}
\]

since $4z_0^3$ is not zero at any of the poles; $1/4z_0^3$ is thus the value of each residue at $z = z_0$.
Substituting for the four values $(\pm 1 \pm j)/\sqrt{2}$ gives the following:

\[
\text{residue at } z = (1 + j)/\sqrt{2}:\quad \frac{1}{4(\sqrt{\tfrac{1}{2}})^3 (1 + j)^3} = -(1 + j)/4\sqrt{2}
\]

\[
\text{residue at } z = (1 - j)/\sqrt{2}:\quad \frac{1}{4(\sqrt{\tfrac{1}{2}})^3 (1 - j)^3} = (-1 + j)/4\sqrt{2}
\]

\[
\text{residue at } z = (-1 + j)/\sqrt{2}:\quad \frac{1}{4(\sqrt{\tfrac{1}{2}})^3 (-1 + j)^3} = (1 - j)/4\sqrt{2}
\]

\[
\text{residue at } z = (-1 - j)/\sqrt{2}:\quad \frac{1}{4(\sqrt{\tfrac{1}{2}})^3 (-1 - j)^3} = (1 + j)/4\sqrt{2}
\]

Finding each Laurent series for the four poles explicitly would involve far more
difficult manipulation. However, the enthusiastic reader may like to check at least one
of the above residues.
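
For readers who prefer a numerical check to a Laurent expansion, the Python sketch below compares the L'Hôpital result $1/4z_0^3$ with a direct evaluation of $(z - z_0)/(1 + z^4)$ close to each pole. It also uses the observation that $z_0^4 = -1$ implies $1/4z_0^3 = -z_0/4$. The offset `eps` and the variable names are illustrative assumptions.

```python
import cmath

# The four roots of z^4 = -1, written as exp(j*pi*(1 + 2n)/4) for n = 0, 1, 2, 3
poles = [cmath.exp(1j * cmath.pi * (1 + 2 * n) / 4) for n in range(4)]


def limit_estimate(z0, eps=1e-7):
    """Evaluate (z - z0)/(1 + z^4) just beside the pole z0."""
    z = z0 + eps
    return (z - z0) / (1 + z**4)


formula_values = [1 / (4 * p**3) for p in poles]
estimates = [limit_estimate(p) for p in poles]
for p, value in zip(poles, formula_values):
    print(p, value)
```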


Next suppose that we have a pole of order two at $z = z_0$. The function f(z) then has
a Laurent series expansion of the form

\[
f(z) = \frac{a_{-2}}{(z - z_0)^2} + \frac{a_{-1}}{z - z_0} + a_0 + a_1(z - z_0) + \dots
\]

Again, we are only interested in isolating the residue $a_{-1}$. This time we cannot use
(4.37). Instead, we multiply f(z) by $(z - z_0)^2$ to obtain

\[
(z - z_0)^2 f(z) = a_{-2} + a_{-1}(z - z_0) + a_0(z - z_0)^2 + \dots
\]

and we differentiate to eliminate the unwanted $a_{-2}$:

\[
\frac{d}{dz}[(z - z_0)^2 f(z)] = a_{-1} + 2a_0(z - z_0) + \dots
\]

Letting z tend to $z_0$ then gives

\[
\lim_{z \to z_0} \left[\frac{d}{dz}(z - z_0)^2 f(z)\right] = a_{-1}
\]

the required residue.


We now have the essence of finding residues, so let us recapitulate and generalize.
If f(z) has a pole of order m at $z = z_0$, we first multiply f(z) by $(z - z_0)^m$. If m ≥ 2, we
then need to differentiate as many times as it takes (that is, m - 1 times) to make
$a_{-1}$ the leading term, without the multiplying factor $z - z_0$. The general formula for
the residue at a pole of order m is thus

\[
\frac{1}{(m - 1)!} \lim_{z \to z_0} \left\{\frac{d^{m-1}}{dz^{m-1}}[(z - z_0)^m f(z)]\right\} = a_{-1} \tag{4.38}
\]

where the factor (m - 1)! arises when the term $a_{-1}(z - z_0)^{m-1}$ is differentiated m - 1
times. This formula looks as difficult to apply as finding the Laurent series expansion
directly. This indeed is often so; and hence experience and judgement are required.
A few examples will help to decide on which way to calculate residues. A word
of warning is in order here: a common source of error is confusion between the
derivative in the formula for the residue, and the employment of L'Hôpital's rule to
find the resulting limit.


Example 4.25 Determine the residues of

\[
f(z) = \frac{z^2 - 2z}{(z + 1)^2 (z^2 + 4)}
\]

at each of its poles in the finite z plane.


Solution Factorizing the denominator gives

\[
f(z) = \frac{z^2 - 2z}{(z + 1)^2 (z - 2j)(z + 2j)}
\]

so that f(z) has simple poles at z = 2j and z = -2j and a pole of order two at z = -1.
Using (4.37),

\[
\text{residue at } z = 2j:\quad \lim_{z \to 2j} (z - 2j)\frac{z^2 - 2z}{(z + 1)^2 (z - 2j)(z + 2j)} = \frac{-4 - 4j}{(2j + 1)^2 (4j)} = \tfrac{1}{25}(7 + j)
\]

\[
\text{residue at } z = -2j:\quad \lim_{z \to -2j} (z + 2j)\frac{z^2 - 2z}{(z + 1)^2 (z - 2j)(z + 2j)} = \frac{-4 + 4j}{(-2j + 1)^2 (-4j)} = \tfrac{1}{25}(7 - j)
\]

Using (4.38) with m = 2 we know that

\[
\text{residue at } z = -1:\quad \frac{1}{1!} \lim_{z \to -1} \frac{d}{dz}\left[(z + 1)^2 \frac{z^2 - 2z}{(z + 1)^2 (z^2 + 4)}\right]
\]

\[
= \lim_{z \to -1} \frac{(z^2 + 4)(2z - 2) - (z^2 - 2z)(2z)}{(z^2 + 4)^2} = \frac{(5)(-4) - (3)(-2)}{25} = -\frac{14}{25}
\]

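Example 4.26 Determine the residues of the following functions at the points indicated:

(a) $\dfrac{e^z}{(1 + z^2)^2}$ (z = j)   (b) $\left(\dfrac{\sin z}{z^2}\right)^3$ (z = 0)   (c) $\dfrac{z^4}{(z + 1)^3}$ (z = -1)


Solution (a) Since

\[
\frac{e^z}{(z^2 + 1)^2} = \frac{e^z}{(z + j)^2 (z - j)^2}
\]

and $e^z$ is regular at z = j, it follows that z = j is a pole of order two. Thus, from (4.38),

\[
\text{residue} = \lim_{z \to j} \frac{d}{dz}\left[(z - j)^2 \frac{e^z}{(z + j)^2 (z - j)^2}\right]
= \lim_{z \to j} \frac{d}{dz}\left[\frac{e^z}{(z + j)^2}\right]
\]

\[
= \lim_{z \to j} \frac{(z + j)^2 e^z - 2(z + j) e^z}{(z + j)^4}
= \frac{(2j)^2 e^j - 2(2j) e^j}{(2j)^4} = -\tfrac{1}{4}(1 + j) e^j
\]

Since $e^j = \cos 1 + j \sin 1$, we calculate the residue at z = j as 0.075 - j0.345.


(b) The function $[(\sin z)/z^2]^3$ has a pole at z = 0, and, since (sin z)/z → 1 as z → 0,
$(\sin^2 z)/z^2$ may also be defined as 1 at z = 0. Therefore, since

\[
\left(\frac{\sin z}{z^2}\right)^3 = \frac{\sin^3 z}{z^3} \cdot \frac{1}{z^3}
\]

the singularity at z = 0 must be a pole of order three. We could use (4.38) to obtain
the residue, which would involve determining the second derivative, but it is easier
in this case to derive the coefficient of 1/z from the Laurent series expansion

\[
\frac{\sin z}{z} = 1 - \frac{z^2}{3!} + \frac{z^4}{5!} - \dots
\]

giving

\[
\frac{\sin z}{z^2} = \frac{1}{z} - \tfrac{1}{6} z + \tfrac{1}{120} z^3 - \dots
\]

Taking the cube of this series, we have

\[
\left(\frac{\sin z}{z^2}\right)^3 = \left(\frac{1}{z} - \tfrac{1}{6} z + \tfrac{1}{120} z^3 - \dots\right)^3
= \frac{1}{z^3} - 3\,\frac{1}{z^2}\,\frac{z}{6} + \dots = \frac{1}{z^3} - \frac{1}{2z} + \dots
\]

Hence the residue at z = 0 is -1/2.


(c) The function $z^4/(z + 1)^3$ has a triple pole at z = -1, so, using (4.38),

\[
\text{residue} = \lim_{z \to -1} \left\{\frac{1}{2} \frac{d^2}{dz^2}\left[(z + 1)^3 \frac{z^4}{(z + 1)^3}\right]\right\}
= \lim_{z \to -1} \left[\frac{1}{2} \frac{d^2}{dz^2}(z^4)\right]
\]

\[
= \lim_{z \to -1} \tfrac{1}{2} \times 4 \times 3 z^2 = 6(-1)^2 = 6
\]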

Residues are sometimes difficult to calculate using (4.38), especially if circular func- 
tions are involved and the pole is of order three or more. In such cases direct calculation 
of the Laurent series expansion using the standard series for sinz and cosz together with 
the binomial series, as in Example 4.26(b), is the best procedure. 
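
The series manipulation of Example 4.26(b) can be mechanized with exact rational arithmetic. The Python sketch below cubes the truncated series for (sin z)/z and reads off the coefficient of $z^2$, which becomes the coefficient of 1/z once the factor $1/z^3$ is restored. The truncation order and the helper names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial


def sinc_coefficients(order):
    """Coefficients of z^0 .. z^order in the series for sin(z)/z."""
    coeffs = [Fraction(0)] * (order + 1)
    for k in range(order // 2 + 1):
        coeffs[2 * k] = Fraction((-1) ** k, factorial(2 * k + 1))
    return coeffs


def multiply_truncated(a, b, order):
    """Product of two power series, discarding powers above z^order."""
    product = [Fraction(0)] * (order + 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            if i + j <= order:
                product[i + j] += a_i * b_j
    return product


order = 4
s = sinc_coefficients(order)
cube = multiply_truncated(multiply_truncated(s, s, order), s, order)

# (sin z / z^2)^3 = (sin z / z)^3 / z^3, so the 1/z coefficient is the z^2 coefficient of the cube
residue = cube[2]
print(residue)
```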


4.5.4 Exercises 


50 Determine the residues of the following rational
   functions at each pole in the finite z plane:


2z+1 1 
a [Bj ees 
Z-z-2 z(1-z) 
32 42 à £l 52 


(2- 1)(22+9) г +42 


(е) 2б +42 +2 +1 (f) ESI 





(z-1) z-1 
2+1 (h - 3 +42 
(2-1) (2+3) г +32 +22 


51 Calculate the residues at the simple poles indicated
   of the following functions:


sinz 


(z = е”) 
z 44 


(a) ЕУ (z=0) (b) 


z-1 


(c) = e= (d-> (z-m) 
z +1 sin Z 


1 = 
(е) P ly @=]) 


52 The following functions have poles at the points
   indicated. Determine the order of the pole and the
   residue there.


@ 925 (20) 





2 

(Dosen ы 
(2+1) (2 +4) 

(с) =; (z 2 nn, n an integer) 
sinz 


(Hint: use $\lim_{u \to 0} (\sin u)/u = 1$ (u = z - nπ), after
differentiating, to replace sin u by u under the limit.)


4.6 Contour integration


Figure 4.25 
Partitioning of 
the curve C. 


Consider the definite integral

\[
\int_{z_1}^{z_2} f(z)\,dz
\]

of the function f(z) of a complex variable z, in which $z_1$ and $z_2$ are a pair of complex
numbers. This implies that we evaluate the integral as z takes values, in the z plane,
from the point $z_1$ to the point $z_2$. Since these are two points in a plane, it follows that to
evaluate the definite integral we require that some path from $z_1$ to $z_2$ be defined. It is
therefore clear that a definite integral of a complex function f(z) is in fact a line integral.

Line integrals were considered in Section 3.4.1. Briefly, for now, a line integral in
the (x, y) plane, of the real variables x and y, is an integral of the form

\[
\int_C [P(x, y)\,dx + Q(x, y)\,dy] \tag{4.39}
\]

where C denotes the path of integration between two points A and B in the plane. In the
particular case when

\[
\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \tag{4.40}
\]

the integrand P(x, y)dx + Q(x, y)dy is a total differential, and the line integral is
independent of the path C joining A and B.

In this section we introduce contour integration, which is the term used for evaluating
line integrals in the complex plane.


4.6.1 Contour integrals 


Let f(z) be a complex function that is continuous at all points of a simple curve C in the 
z plane that is of finite length and joins two points a and b. (We have not gone into great 
detail regarding the question of continuity for complex variables. Suffice it to say that 
the intuitive concepts described in Chapter 9 of Modern Engineering Mathematics for 
real variables carry over to the case of complex variables.) Subdivide the curve into n
parts by the points $z_1, z_2, \dots, z_{n-1}$, taking $z_0 = a$ and $z_n = b$ (Figure 4.25). On each arc
joining $z_{k-1}$ to $z_k$ (k = 1, ..., n) choose a point $\tilde{z}_k$. Form the sum

\[
S_n = f(\tilde{z}_1)(z_1 - z_0) + f(\tilde{z}_2)(z_2 - z_1) + \dots + f(\tilde{z}_n)(z_n - z_{n-1})
\]

Then, writing $z_k - z_{k-1} = \Delta z_k$, $S_n$ becomes

\[
S_n = \sum_{k=1}^{n} f(\tilde{z}_k)\,\Delta z_k
\]

If we let n increase in such a way that the largest of the chord lengths $|\Delta z_k|$ approaches
zero then the sum $S_n$ approaches a limit that does not depend on the mode of subdivision
of the curve. We call this limit the contour integral of f(z) along the curve C:

\[
\int_C f(z)\,dz = \lim_{|\Delta z_k| \to 0} \sum_{k=1}^{n} f(\tilde{z}_k)\,\Delta z_k \tag{4.41}
\]

If we take z = x + jy and express f(z) as

\[
f(z) = u(x, y) + jv(x, y)
\]

then it can be shown from (4.41) that

\[
\int_C f(z)\,dz = \int_C [u(x, y) + jv(x, y)](dx + j\,dy)
\]

or

\[
\int_C f(z)\,dz = \int_C [u(x, y)\,dx - v(x, y)\,dy] + j\int_C [v(x, y)\,dx + u(x, y)\,dy] \tag{4.42}
\]


Both of the integrals on the right-hand side of (4.42) are real line integrals of the 
form (4.39), and can therefore be evaluated using the methods developed for such 
integrals. 


Example 4.27 Evaluate the contour integral $\int_C z^2\,dz$ along the path C from -1 + j to 5 + j3 and
composed of two straight line segments, the first from -1 + j to 5 + j and the second from
5 + j to 5 + j3.

Figure 4.26 Path of integration for Example 4.27.


Solution The path of integration C is shown in Figure 4.26. Since

\[
z^2 = (x + jy)^2 = (x^2 - y^2) + j2xy
\]

it follows from (4.42) that

\[
I = \int_C z^2\,dz = \int_C [(x^2 - y^2)\,dx - 2xy\,dy] + j\int_C [2xy\,dx + (x^2 - y^2)\,dy]
\]

Along AB, y = 1 and dy = 0, so that

\[
I_{AB} = \int_{-1}^{5} (x^2 - 1)\,dx + j\int_{-1}^{5} 2x\,dx
       = [\tfrac{1}{3}x^3 - x]_{-1}^{5} + j[x^2]_{-1}^{5} = 36 + j24
\]

Along BD, x = 5 and dx = 0, so that

\[
I_{BD} = \int_{1}^{3} -10y\,dy + j\int_{1}^{3} (25 - y^2)\,dy
       = [-5y^2]_{1}^{3} + j[25y - \tfrac{1}{3}y^3]_{1}^{3} = -40 + j\tfrac{124}{3}
\]

Thus

\[
\int_C z^2\,dz = I_{AB} + I_{BD} = (36 + j24) + (-40 + j\tfrac{124}{3}) = -4 + j\tfrac{196}{3}
\]
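
The defining sum (4.41) can be evaluated directly on a computer. The Python sketch below approximates the integral of Example 4.27 by subdividing each straight segment into equal chords and sampling f at the midpoint of each chord. The number of subdivisions, the midpoint choice and the function name `contour_sum` are illustrative assumptions.

```python
def contour_sum(f, vertices, steps_per_segment=2000):
    """Approximate a contour integral along straight segments by the sum (4.41),
    taking the sample point on each chord to be its midpoint."""
    total = 0
    for start, end in zip(vertices, vertices[1:]):
        dz = (end - start) / steps_per_segment
        for k in range(steps_per_segment):
            midpoint = start + (k + 0.5) * dz
            total += f(midpoint) * dz
    return total


# Path A(-1 + j) -> B(5 + j) -> D(5 + j3) of Example 4.27
approx = contour_sum(lambda z: z**2, [-1 + 1j, 5 + 1j, 5 + 3j])
exact = -4 + 196j / 3
print(approx, exact)
```

Increasing `steps_per_segment` shrinks the largest chord length and brings the sum closer to the exact value, as the limit in (4.41) requires.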


Show that fc (z - 1) dz 2 0, where C is the boundary of the square with vertices at z = 0, 
z=1+)j0,z=1+jlandz=0+)1. 


The path of integration C is shown in Figure 4.27. 
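Since z + 1 = (x + 1) + jy, it follows from (4.42) that
\[ I = \oint_C (z + 1)\,dz = \oint_C [(x + 1)\,dx - y\,dy] + j\oint_C [y\,dx + (x + 1)\,dy] \]
Along OA, y = 0 and dy = 0, so that
\[ I_{OA} = \int_0^1 (x + 1)\,dx = \tfrac{3}{2} \]
Along AB, x = 1 and dx = 0, so that
\[ I_{AB} = \int_0^1 -y\,dy + j\int_0^1 2\,dy = -\tfrac{1}{2} + j2 \]
Along BD, y = 1 and dy = 0, so that
\[ I_{BD} = \int_1^0 (x + 1)\,dx + j\int_1^0 dx = -\tfrac{3}{2} - j \]
Along DO, x = 0 and dx = 0, so that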
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4.6.2 


Theorem 4.1 


Proof 


\[ I_{DO} = \int_1^0 -y\,dy + j\int_1^0 dy = \tfrac{1}{2} - j \]
Thus
\[ \oint_C (z + 1)\,dz = I_{OA} + I_{AB} + I_{BD} + I_{DO} = 0 \]
C 


Cauchy's theorem 


The most important result in the whole of complex variable theory is called Cauchy's 
theorem and it provides the foundation on which the theory of integration with respect 
to a complex variable is based. The theorem may be stated as follows. 


Cauchy's theorem 


If f(z) is an analytic function with derivative f’(z) that is continuous at all points 
inside and on a simple closed curve C then 


\[ \oint_C f(z)\,dz = 0 \]

(Note the use of the symbol $\oint_C$ to denote integration around a closed curve, with the
convention being that the integral is evaluated travelling round C in the positive or 
anticlockwise direction.) 


To prove the theorem, we make use of Green's theorem in a plane, which was introduced
in Section 3.4.5. At this stage a statement of the theorem is sufficient.

If C is a simple closed curve enclosing a region A in a plane, and P(x, y) and Q(x, y) are
continuous functions with continuous partial derivatives, then
\[ \oint_C (P\,dx + Q\,dy) = \iint_A \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy \qquad (4.43) \]


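Returning to the contour integral and taking
\[ f(z) = u(x, y) + jv(x, y), \qquad z = x + jy \]
we have from (4.42)
\[ \oint_C f(z)\,dz = \oint_C (u\,dx - v\,dy) + j\oint_C (v\,dx + u\,dy) \qquad (4.44) \]
Since f(z) is analytic, the Cauchy-Riemann equations
\[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y} \]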


are satisfied on C and within the region R enclosed by C. 


Figure 4.28 
Deformed contour for 
an isolated singularity. 
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Since u(x, y) and v(x, y) satisfy the conditions imposed on P(x, y) and Q(x, y) in 
Green's theorem, we can apply (4.43) to both integrals on the right-hand side of (4.44) 
to give 


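\[ \oint_C f(z)\,dz = \iint_R \left(-\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right) dx\,dy + j\iint_R \left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) dx\,dy = 0 + j0 \]
by the Cauchy-Riemann equations. Thus
\[ \oint_C f(z)\,dz = 0 \]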


as required. 


end of theorem 


In fact, the restriction in Cauchy’s theorem that f’(z) has to be continuous on C can 
be removed and so make the theorem applicable to a wider class of functions. A revised 
form of Theorem 4.1, with the restriction removed, is referred to as the fundamental 
theorem of complex integration. Since the proof that f’(z) need not be continuous on 
C was first proposed by Goursat, the fundamental theorem is also sometimes referred 
to as the Cauchy—Goursat theorem. We shall not pursue the consequences of relaxa- 
tion of this restriction any further in this book. 
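Cauchy's theorem can be illustrated numerically on the unit square of Example 4.28. The sketch below is an illustration only; the function name `closed_polygon_integral` and the midpoint rule are assumptions, and the non-analytic function $\bar{z}$ is included purely for contrast (for it the closed-contour integral equals 2j times the enclosed area).

```python
import cmath


def closed_polygon_integral(f, vertices, n_per_side=2000):
    """Midpoint-rule approximation of the integral of f round a closed polygon.

    The vertices are listed in the order of travel (anticlockwise here).
    """
    total = 0j
    m = len(vertices)
    for i in range(m):
        a, b = vertices[i], vertices[(i + 1) % m]
        dz = (b - a) / n_per_side
        for k in range(n_per_side):
            total += f(a + (k + 0.5) * dz) * dz
    return total


square = [0j, 1 + 0j, 1 + 1j, 0 + 1j]

if __name__ == "__main__":
    # Analytic integrands: the results should be close to zero.
    print(closed_polygon_integral(cmath.exp, square))
    print(closed_polygon_integral(lambda z: z + 1, square))
    # Non-analytic integrand: the result is not zero (close to 2j here).
    print(closed_polygon_integral(lambda z: z.conjugate(), square))
```

The first two results vanish to within rounding and quadrature error, while the third does not, which is consistent with the requirement that f(z) be analytic.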

In practice, we frequently need to evaluate contour integrals involving functions such as
\[ \frac{1}{z - 2}, \qquad \frac{z}{(z - 3)^2 (z + 2)} \]
that have singularities associated with them. Since the function ceases to be analytic
at such points, how do we accommodate a singularity if it is inside the contour of
integration? To resolve the problem the singularity is removed by deforming the contour.

First let us consider the case when the complex function f(z) has a single isolated
singularity at $z = z_0$ inside a closed curve C. To remove the singularity, we surround it
by a circle $\gamma_1$ of radius $\rho_1$, and then cut the region between the circle and the outer
contour C by a straight line AB. This leads to the deformed contour indicated by the
arrows in Figure 4.28. In the figure the line linking the circle $\gamma_1$ to the contour C is
shown as a narrow channel in order to enable us to distinguish between the path A to
B and the path B to A. The region inside this deformed contour is shown shaded in the
figure (recall that the region inside a closed contour is the region on the left as we travel
round it). Since this contains no singularities, we can apply Cauchy's theorem and write


У 





z plane 
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Example 4.29 


Solution 





z plane 


Figure 4.29 A circle 
of radius p, centred 
at the origin. 


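\[ \oint_C f(z)\,dz + \int_{AB} f(z)\,dz + \oint_{\gamma_1} f(z)\,dz + \int_{BA} f(z)\,dz = 0 \]
Since
\[ \int_{AB} f(z)\,dz = -\int_{BA} f(z)\,dz \quad \text{and} \quad \oint_{\gamma_1} f(z)\,dz = -\oint_{\gamma_1^+} f(z)\,dz \]
this reduces to
\[ \oint_C f(z)\,dz = \oint_{\gamma_1^+} f(z)\,dz \qquad (4.45) \]
with the + indicating the change of sense from clockwise to anticlockwise around the
circle $\gamma_1$.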


Evaluate the integral $\oint_C dz/z$ around

(a) any contour containing the origin;
(b) any contour not containing the origin.

(a) f(z) = 1/z has a singularity (a simple pole) at z = 0. Hence, using (4.45), the
integral around any contour enclosing the origin is the same as the integral around
a circle $\gamma_1$, centred at the origin and of radius $\rho_0$. We thus need to evaluate
\[ \oint_{\gamma_1} \frac{1}{z}\,dz \]
As can be seen from Figure 4.29, on the circle $\gamma_1$
\[ z = \rho_0\,e^{j\theta} \quad (0 \le \theta < 2\pi) \]
so
\[ dz = j\rho_0\,e^{j\theta}\,d\theta \]
leading to
\[ \oint_{\gamma_1} \frac{1}{z}\,dz = \int_0^{2\pi} \frac{j\rho_0\,e^{j\theta}}{\rho_0\,e^{j\theta}}\,d\theta = \int_0^{2\pi} j\,d\theta = 2\pi j \]
Hence if C encloses the origin then
\[ \oint_C \frac{dz}{z} = 2\pi j \]
(b) If C does not enclose the origin then, by Cauchy's theorem,
\[ \oint_C \frac{dz}{z} = 0 \]
since 1/z is analytic inside and on any curve that does not enclose the origin.


Example 4.30 


Solution 


Example 4.31 


Solution 
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Generalize the result of Example 4.29 by evaluating 


\[ \oint_C \frac{dz}{z^n} \]


where n is an integer, around any contour containing the origin. 


If n ≤ 0, we can apply Cauchy's theorem straight away (or evaluate the integral directly)
to show the integral is zero. If n > 1, we proceed as in Example 4.29 and evaluate the
integral around a circle, centred at the origin. Taking $z = \rho_0\,e^{j\theta}$ as in Example 4.29, we have
\[ \oint_C \frac{dz}{z^n} = \int_0^{2\pi} \frac{j\rho_0\,e^{j\theta}}{\rho_0^{\,n}\,e^{nj\theta}}\,d\theta \]
where $\rho_0$ is once more the radius of the circle. If n ≠ 1,
\[ \oint_C \frac{dz}{z^n} = j\int_0^{2\pi} \frac{d\theta}{\rho_0^{\,n-1}\,e^{(n-1)j\theta}} = j\rho_0^{\,1-n}\left[\frac{e^{(1-n)j\theta}}{(1-n)j}\right]_0^{2\pi} = \frac{\rho_0^{\,1-n}}{1-n}\left(e^{(1-n)2\pi j} - 1\right) = 0 \]
since $e^{2\pi jN} = 1$ for any integer N. Hence
\[ \oint_C \frac{dz}{z^n} = 0 \quad (n \ne 1) \]


In Examples 4.29 and 4.30 we have thus established the perhaps surprising result
that if C is a contour containing the origin then
\[ \oint_C \frac{dz}{z^n} = \begin{cases} 2\pi j & (n = 1) \\ 0 & (n \text{ any other integer}) \end{cases} \]


If C does not contain the origin, the integral is of course zero by Cauchy's theorem. 
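The result of Examples 4.29 and 4.30 can be confirmed numerically on circles of different radii. The sketch below is illustrative only; `circle_integral` and `power_integral` are assumed helper names, and the circle is parametrized as $z = \rho\,e^{j\theta}$ exactly as in the examples.

```python
import cmath
import math


def circle_integral(g, centre, radius, n=4000):
    """Integrate g anticlockwise round |z - centre| = radius (midpoint rule in theta)."""
    total = 0j
    dtheta = 2 * math.pi / n
    for k in range(n):
        e = cmath.exp(1j * (k + 0.5) * dtheta)
        # dz = j * radius * e^{j theta} d(theta)
        total += g(centre + radius * e) * 1j * radius * e * dtheta
    return total


def power_integral(n_power, radius):
    """Integral of dz / z**n_power round the circle |z| = radius."""
    return circle_integral(lambda z: 1 / z ** n_power, 0j, radius)


if __name__ == "__main__":
    for radius in (0.5, 3.0):
        for n_power in range(-2, 4):
            print(radius, n_power, power_integral(n_power, radius))
```

Only the case n = 1 gives a value near 2πj; every other integer power gives a value near zero, independently of the radius.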


Evaluate the integral
\[ \oint_C \frac{dz}{z - 2 - j} \]
around any contour C containing the point z = 2 + j.

The function
\[ f(z) = \frac{1}{z - 2 - j} \]
has a singularity (simple pole) at z = 2 + j. Hence, using (4.45), the integral around any
contour C enclosing the point z = 2 + j is the same as the integral around a circle $\gamma$
centred at z = 2 + j and of radius $\rho$. Thus we need to evaluate
\[ \oint_{\gamma} \frac{dz}{z - 2 - j} \]
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z plane 


Figure 4.30 A circle 
of radius p centred at 
24]. 





Figure 4.31 
Deformed contour 
for n singularities. 


Example 4.32 


Solution 


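As can be seen from Figure 4.30, on the circle $\gamma$
\[ z = (2 + j) + \rho\,e^{j\theta} \quad (0 \le \theta < 2\pi), \qquad dz = j\rho\,e^{j\theta}\,d\theta \]
leading to
\[ \oint_{\gamma} \frac{dz}{z - 2 - j} = \int_0^{2\pi} \frac{j\rho\,e^{j\theta}}{\rho\,e^{j\theta}}\,d\theta = \int_0^{2\pi} j\,d\theta = 2\pi j \]
Hence if C encloses the point z = 2 + j then
\[ \oint_C \frac{dz}{z - 2 - j} = 2\pi j \]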


Compare this with the answer to Example 4.29. 





So far we have only considered functions having a single singularity inside the
closed contour C. The method can be extended to accommodate any finite number of
singularities. If the function f(z) has a finite number of singularities at $z = z_1, z_2, \ldots, z_n$
inside a closed contour C, then we can deform the latter by introducing n circles $\gamma_1, \gamma_2, \ldots, \gamma_n$
to surround each of the singularities as shown in Figure 4.31. It is then readily
shown that
\[ \oint_C f(z)\,dz = \oint_{\gamma_1} f(z)\,dz + \oint_{\gamma_2} f(z)\,dz + \cdots + \oint_{\gamma_n} f(z)\,dz \qquad (4.46) \]


Evaluate the contour integral
\[ \oint_C \frac{z\,dz}{(z - 1)(z + 2j)} \]
where C is

(a) any contour enclosing both the points z = 1 and z = −2j;
(b) any contour enclosing z = −2j but excluding the point z = 1.

The function
\[ f(z) = \frac{z}{(z - 1)(z + 2j)} \]
has singularities at both z = 1 and z = −2j.
(a) Since the contour encloses both singularities, we need to evaluate the integrals
around circles $\gamma_1$ and $\gamma_2$ of radii $\rho_1$ and $\rho_2$ surrounding the points z = 1 and z = −2j
respectively. Alternatively, we can resolve f(z) into partial fractions as
\[ f(z) = \frac{\tfrac{1}{5}(1 - j2)}{z - 1} + \frac{\tfrac{1}{5}(4 + j2)}{z + 2j} \]
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and consider 


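\[ I = \oint_C \frac{z\,dz}{(z - 1)(z + 2j)} = \tfrac{1}{5}(1 - j2)\oint_C \frac{dz}{z - 1} + \tfrac{1}{5}(4 + j2)\oint_C \frac{dz}{z + 2j} = \tfrac{1}{5}(1 - j2)I_1 + \tfrac{1}{5}(4 + j2)I_2 \]
The integrand of $I_1$ has a single singularity at z = 1, and we simply need to
evaluate it around the circle $\gamma_1$ of radius $\rho_1$ about z = 1 to give
\[ I_1 = 2\pi j \]
Similarly, $I_2$ has a single singularity at z = −2j, and we evaluate it around the circle
$\gamma_2$ to give
\[ I_2 = 2\pi j \]
Then
\[ I = \tfrac{1}{5}(1 - j2)\,2\pi j + \tfrac{1}{5}(4 + j2)\,2\pi j = 2\pi j \]
Thus if the contour C contains both the singularities then
\[ \oint_C \frac{z\,dz}{(z - 1)(z + 2j)} = 2\pi j \]
(b) If the contour C only contains the singularity z = −2j then
\[ \oint_C \frac{z\,dz}{(z - 1)(z + 2j)} = \tfrac{1}{5}(4 + j2)\,2\pi j = 2\pi j\left(\tfrac{4}{5} + j\tfrac{2}{5}\right) \]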


In Examples 4.29 to 4.32 we can note some similarity in the answers, with the common
occurrence of the term 2πj. It therefore appears that it may be possible to obtain some
general results to assist in the evaluation of contour integrals. Indeed, this is the case,
and such general results are contained in the Cauchy integral theorem.


Theorem 4.2 Cauchy integral theorem 


Let f(z) be an analytic function within and on a simple closed contour C. If $z_0$ is any
point in C then
\[ \oint_C \frac{f(z)}{z - z_0}\,dz = 2\pi j\,f(z_0) \qquad (4.47) \]
If we differentiate repeatedly n times with respect to $z_0$ under the integral sign then it
also follows that
\[ \oint_C \frac{f(z)}{(z - z_0)^{n+1}}\,dz = \frac{2\pi j}{n!}\,f^{(n)}(z_0) \qquad (4.48) \]
end of theorem

Note that (4.48) implies that if f′(z) exists at $z = z_0$ so does $f^{(n)}(z)$ for all n, as predicted
earlier in the observations following Example 4.16.
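Formulas (4.47) and (4.48) lend themselves to a quick numerical check. In the sketch below the choice $f(z) = e^z$ (all of whose derivatives equal $e^z$), the point $z_0 = 0.3 + 0.2j$ and the unit-circle contour are assumptions made only for illustration, as are the helper names.

```python
import cmath
import math


def circle_integral(g, centre, radius, n=2000):
    """Integrate g anticlockwise round |z - centre| = radius (midpoint rule in theta)."""
    total = 0j
    dtheta = 2 * math.pi / n
    for k in range(n):
        e = cmath.exp(1j * (k + 0.5) * dtheta)
        total += g(centre + radius * e) * 1j * radius * e * dtheta
    return total


def cauchy_lhs(f, z0, order, centre=0j, radius=1.0):
    """Left-hand side of (4.48); order = 0 reproduces (4.47)."""
    return circle_integral(lambda z: f(z) / (z - z0) ** (order + 1), centre, radius)


def cauchy_rhs_exp(z0, order):
    """Right-hand side of (4.48) for f(z) = exp(z), whose nth derivative is exp(z)."""
    return 2j * math.pi / math.factorial(order) * cmath.exp(z0)


if __name__ == "__main__":
    z0 = 0.3 + 0.2j
    for order in range(4):
        print(order, cauchy_lhs(cmath.exp, z0, order), cauchy_rhs_exp(z0, order))
```

The two columns printed for each order should agree closely, since $z_0$ lies inside the unit circle.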
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Example 4.33 


Solution 


Example 4.34 


Solution 


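Evaluate the contour integral
\[ \oint_C \frac{2z\,dz}{(z - 1)(z + 2)(z + j)} \]
where C is a contour that includes the three points z = 1, z = −2 and z = −j.

Since
\[ f(z) = \frac{2z}{(z - 1)(z + 2)(z + j)} \]
has singularities at the points z = 1, z = −2 and z = −j inside the contour, it follows from
(4.46) that
\[ \oint_C f(z)\,dz = \oint_{\gamma_1} f(z)\,dz + \oint_{\gamma_2} f(z)\,dz + \oint_{\gamma_3} f(z)\,dz \qquad (4.49) \]
where $\gamma_1$, $\gamma_2$ and $\gamma_3$ are circles centred at the singularities z = 1, z = −2 and z = −j
respectively. In order to make use of the Cauchy integral theorem, (4.49) is written as
\[ \oint_C f(z)\,dz = \oint_{\gamma_1} \frac{2z/[(z + 2)(z + j)]}{z - 1}\,dz + \oint_{\gamma_2} \frac{2z/[(z - 1)(z + j)]}{z + 2}\,dz + \oint_{\gamma_3} \frac{2z/[(z - 1)(z + 2)]}{z + j}\,dz \]
\[ = \oint_{\gamma_1} \frac{f_1(z)}{z - 1}\,dz + \oint_{\gamma_2} \frac{f_2(z)}{z + 2}\,dz + \oint_{\gamma_3} \frac{f_3(z)}{z + j}\,dz \]
Since $f_1(z)$, $f_2(z)$ and $f_3(z)$ are analytic within and on the circles $\gamma_1$, $\gamma_2$ and $\gamma_3$ respectively,
it follows from (4.47) that
\[ \oint_C f(z)\,dz = 2\pi j\,[f_1(1) + f_2(-2) + f_3(-j)] \]
\[ = 2\pi j\left[\frac{2}{3(1 + j)} + \frac{-4}{(-3)(-2 + j)} + \frac{-2j}{(-j - 1)(-j + 2)}\right] \]
so that
\[ \oint_C \frac{2z\,dz}{(z - 1)(z + 2)(z + j)} = 0 \]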


Evaluate the contour integral
\[ \oint_C \frac{z^4}{(z - 1)^3}\,dz \]
where the contour C encloses the point z = 1.

Since $f(z) = z^4/(z - 1)^3$ has a pole of order three at z = 1, it follows that
\[ \oint_C f(z)\,dz = \oint_{\gamma} \frac{z^4}{(z - 1)^3}\,dz \]


53 


54 


55 


56 
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where y is a circle centred at z = 1. Writing f,(z) = 2°, then 


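\[ \oint_C f(z)\,dz = \oint_{\gamma} \frac{f_1(z)}{(z - 1)^3}\,dz \]
and, since $f_1(z)$ is analytic within and on the circle $\gamma$, it follows from (4.48) that
\[ \oint_C f(z)\,dz = 2\pi j\,\frac{1}{2!}\left[\frac{d^2 f_1}{dz^2}\right]_{z=1} = \pi j\,(12z^2)_{z=1} \]
so that
\[ \oint_C \frac{z^4}{(z - 1)^3}\,dz = 12\pi j \]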





4.6.3 Exercises 


Evaluate $\int_C (z^2 + 3z)\,dz$ along the following contours
C in the complex z plane:

(a) the straight line joining 2 + j0 to 0 + j2;
(b) the straight lines from 2 + j0 to 2 + j2 and then to 0 + j2;
(c) the circle |z| = 2 from 2 + j0 to 0 + j2 in an anticlockwise direction.


Evaluate $\oint_C (5z^4 - z^3 + 2)\,dz$ around the following
closed contours C in the z plane:


58 
(a) the circle |z| = 1;
(b) the square with vertices at 0 + j0, 1 + j0, 1 + j1 and 0 + j1;
(c) the curve consisting of the parabolas $y = x^2$ from 0 + j0 to 1 + j1 and $y^2 = x$ from 1 + j to 0 + j0.
Generalize the result of Example 4.30, and show that
\[ \oint_C \frac{dz}{(z - z_0)^n} = \begin{cases} 2\pi j & (n = 1) \\ 0 & (n \ne 1) \end{cases} \]
where C is a simple closed contour surrounding
the point $z = z_0$.

59


Evaluate the contour integral
\[ \oint_C \frac{dz}{z - 4} \]
where C is any simple closed curve and z = 4 is

(a) outside C
(b) inside C


- nj(122).4 


Using the Cauchy integral theorem, evaluate the 
contour integral 


2zdz 
,Qz- DG-2) 
where C is 


(a) the circle |z| 2 1 
(b) the circle |z| 2 3 


Using the Cauchy integral theorem, evaluate the 
contour integral 


— 5d 
ё (z* 1)(z- 2)(z * 4j) 
where C is 


(a) the circle |z| 2 3 
(b) the circle |z| 2 5 


Using the Cauchy integral theorem, evaluate the 
following contour integrals: 


(а) } E 
5 Qz4 1) 


where C is the unit circle |z| = 1; 


b) } A 
с (2- 1)(2+2) 


where C is the circle |z| = 3. 
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4.6.4 


The residue theorem 


This theorem draws together the theories of differentiation and integration of a complex
function. It is concerned with the evaluation of the contour integral
\[ I = \oint_C f(z)\,dz \]
where the complex function f(z) has a finite number n of isolated singularities at $z_1, z_2, \ldots, z_n$
inside the closed contour C. Defining the contour C as in Figure 4.31, we
have as in (4.46) that
\[ I = \oint_C f(z)\,dz = \oint_{\gamma_1} f(z)\,dz + \oint_{\gamma_2} f(z)\,dz + \cdots + \oint_{\gamma_n} f(z)\,dz \qquad (4.46) \]
If we assume that f(z) has a pole of order m at $z = z_i$ then it can be represented by the
Laurent series expansion
\[ f(z) = \frac{a_{-m}^{(i)}}{(z - z_i)^m} + \cdots + \frac{a_{-1}^{(i)}}{z - z_i} + a_0^{(i)} + a_1^{(i)}(z - z_i) + \cdots + a_m^{(i)}(z - z_i)^m + \cdots \]
valid in the annulus $r_i < |z - z_i| < R_i$. If the curve C lies entirely within this annulus
then, by Cauchy's theorem, (4.46) becomes
\[ I = \oint_C f(z)\,dz = \oint_{\gamma_i} f(z)\,dz \]


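Substituting the Laurent series expansion of f(z), which we can certainly do since we
are within the annulus of convergence, we obtain
\[ \oint_{\gamma_i} f(z)\,dz = \oint_{\gamma_i} \left[\frac{a_{-m}^{(i)}}{(z - z_i)^m} + \cdots + \frac{a_{-1}^{(i)}}{z - z_i} + a_0^{(i)} + a_1^{(i)}(z - z_i) + \cdots + a_m^{(i)}(z - z_i)^m + \cdots\right] dz \]
\[ = a_{-m}^{(i)} \oint_{\gamma_i} \frac{dz}{(z - z_i)^m} + \cdots + a_{-1}^{(i)} \oint_{\gamma_i} \frac{dz}{z - z_i} + a_0^{(i)} \oint_{\gamma_i} dz + a_1^{(i)} \oint_{\gamma_i} (z - z_i)\,dz + \cdots \]
Using the result from Exercise 55, all of these integrals are zero, except the one
multiplying $a_{-1}^{(i)}$, the residue, which has the value 2πj. We have therefore shown that
\[ \oint_{\gamma_i} f(z)\,dz = 2\pi j\,a_{-1}^{(i)} = 2\pi j \times \text{residue at } z = z_i \]
This clearly generalizes, so that (4.46) becomes
\[ I = \oint_C f(z)\,dz = 2\pi j \sum_{i=1}^{n} (\text{residue at } z = z_i) = 2\pi j \times (\text{sum of residues inside } C) \]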


Thus we have the following general result. 


Theorem 4.3 


Example 4.35 


Solution 


Example 4.36 


Solution 
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The residue theorem 


If f(z) is an analytic function within and on a simple closed curve C, apart from 
a finite number of poles, then 


\[ \oint_C f(z)\,dz = 2\pi j \times [\text{sum of residues of } f(z) \text{ at the poles inside } C] \]

end of theorem

This is quite a remarkable result in that it enables us to evaluate the contour integral
$\oint_C f(z)\,dz$ by simply evaluating one coefficient of the Laurent series expansion of f(z) at
each of its singularities inside C.


Evaluate the contour integral $\oint_C dz/[z(1 + z)]$ if C is

(a) the circle |z| = 1/2; (b) the circle |z| = 2.

The singularities of 1/[z(1 + z)] are at z = 0 and −1. Evaluating the residues using (4.37),
we have
\[ \text{residue at } z = 0: \quad \lim_{z \to 0} z\,\frac{1}{z(1 + z)} = 1 \]
\[ \text{residue at } z = -1: \quad \lim_{z \to -1} (z + 1)\,\frac{1}{z(1 + z)} = -1 \]
(a) If C is |z| = 1/2 then it contains the pole at z = 0, but not the pole at z = −1. Hence,
by the residue theorem,
\[ \oint_C \frac{dz}{z(z + 1)} = 2\pi j \times (\text{residue at } z = 0) = 2\pi j \]
(b) If C is |z| = 2 then both poles are inside C. Hence, by the residue theorem,
\[ \oint_C \frac{dz}{z(z + 1)} = 2\pi j\,(1 - 1) = 0 \]





Evaluate the contour integral
\[ \oint_C \frac{z^3 - z^2 + z - 1}{z^3 + 4z}\,dz \]
where C is

(a) |z| = 1 (b) |z| = 3

The rational function
\[ \frac{z^3 - z^2 + z - 1}{z^3 + 4z} \]
has poles at z = 0 and ±2j. Evaluating the residues using (4.37) gives
\[ \text{residue at } z = 0: \quad \lim_{z \to 0} \frac{z(z^3 - z^2 + z - 1)}{z(z^2 + 4)} = -\tfrac{1}{4} \]
\[ \text{residue at } z = 2j: \quad \lim_{z \to 2j} \frac{(z - 2j)(z^3 - z^2 + z - 1)}{z(z - 2j)(z + 2j)} = -\tfrac{3}{8} + \tfrac{3}{4}j \]
\[ \text{residue at } z = -2j: \quad \lim_{z \to -2j} \frac{(z + 2j)(z^3 - z^2 + z - 1)}{z(z - 2j)(z + 2j)} = -\tfrac{3}{8} - \tfrac{3}{4}j \]
(Note that these have been evaluated in Exercise 50(d).)

(a) If C is |z| = 1 then only the pole at z = 0 is inside the contour, so only the residue
there is taken into account in the residue theorem, and
\[ \oint_C \frac{z^3 - z^2 + z - 1}{z^3 + 4z}\,dz = 2\pi j\left(-\tfrac{1}{4}\right) = -\tfrac{1}{2}\pi j \]
(b) If C is |z| = 3 then all the poles are inside the contour. Hence, by the residue
theorem,
\[ \oint_C \frac{z^3 - z^2 + z - 1}{z^3 + 4z}\,dz = 2\pi j\left(-\tfrac{1}{4} - \tfrac{3}{8} + \tfrac{3}{4}j - \tfrac{3}{8} - \tfrac{3}{4}j\right) = -2\pi j \]
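The residue theorem can be cross-checked for Example 4.36 by comparing a direct numerical contour integral with 2πj times the sum of numerically estimated residues. This is a sketch under stated assumptions: the residues are estimated by integrating round a small circle of radius 0.1 about each pole (an illustrative choice, not the limit formula (4.37)), and all helper names are hypothetical.

```python
import cmath
import math


def circle_integral(g, centre, radius, n=4000):
    """Integrate g anticlockwise round |z - centre| = radius (midpoint rule in theta)."""
    total = 0j
    dtheta = 2 * math.pi / n
    for k in range(n):
        e = cmath.exp(1j * (k + 0.5) * dtheta)
        total += g(centre + radius * e) * 1j * radius * e * dtheta
    return total


def f(z):
    return (z ** 3 - z ** 2 + z - 1) / (z ** 3 + 4 * z)


POLES = [0j, 2j, -2j]


def residue(g, pole, radius=0.1):
    """Estimate a residue as (1 / 2 pi j) times the integral round a small circle."""
    return circle_integral(g, pole, radius) / (2j * math.pi)


def by_residue_theorem(g, poles, contour_radius):
    """2 pi j times the sum of residues at the poles inside |z| = contour_radius."""
    inside = [p for p in poles if abs(p) < contour_radius]
    return 2j * math.pi * sum(residue(g, p) for p in inside)


if __name__ == "__main__":
    for p in POLES:
        print("residue at", p, "is approximately", residue(f, p))
    for R in (1.0, 3.0):
        print(R, circle_integral(f, 0j, R), by_residue_theorem(f, POLES, R))
```

For each radius the direct integral and the residue sum should agree, with values close to −πj/2 for |z| = 1 and −2πj for |z| = 3.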


Example 4.37 Evaluate the contour integral
\[ \oint_C \frac{dz}{z^3(z^2 + 2z + 2)} \]
where C is the circle |z| = 3.

Solution The poles of $1/[z^3(z^2 + 2z + 2)]$ are as follows: a pole of order three at z = 0, and two
simple poles where $z^2 + 2z + 2 = 0$, that is at z = −1 ± j. All of these poles lie inside the
contour C.
From (4.38), the residue at z = 0 is given by
\[ \lim_{z \to 0} \frac{1}{2!}\frac{d^2}{dz^2}\left[\frac{1}{z^2 + 2z + 2}\right] = \lim_{z \to 0} \frac{1}{2}\frac{d}{dz}\left[\frac{-(2z + 2)}{(z^2 + 2z + 2)^2}\right] = \lim_{z \to 0} \frac{d}{dz}\left[\frac{-(z + 1)}{(z^2 + 2z + 2)^2}\right] \]
\[ = \lim_{z \to 0} \frac{-(z^2 + 2z + 2)^2 + (z + 1)\,2(z^2 + 2z + 2)(2z + 2)}{(z^2 + 2z + 2)^4} = \tfrac{1}{4} \]
From (4.37), the residue at z = −1 − j is
\[ \lim_{z \to -1-j} (z + 1 + j)\,\frac{1}{z^3(z + 1 + j)(z + 1 - j)} = \lim_{z \to -1-j} \frac{1}{z^3(z + 1 - j)} \]
\[ = \frac{1}{(-1 - j)^3(-2j)} = \frac{1}{(1 + j)^3\,2j} = \frac{1}{(-2 + 2j)\,2j} \]
using $(1 + j)^3 = 1 + 3j + 3j^2 + j^3 = -2 + 2j$. Hence
\[ \text{residue at } z = -1 - j: \quad \frac{1}{4}\,\frac{1}{-1 - j} = -\frac{1}{4}\,\frac{1 - j}{2} = \tfrac{1}{8}(-1 + j) \]
Also, using (4.37),
\[ \text{residue at } z = -1 + j: \quad \lim_{z \to -1+j} (z + 1 - j)\,\frac{1}{z^3(z + 1 + j)(z + 1 - j)} \]
which is precisely the complex conjugate of the residue at z = −1 − j. Hence we can take
a short cut with the algebra and state the residue as $\tfrac{1}{8}(-1 - j)$.
The sum of the residues is
\[ \tfrac{1}{4} + \tfrac{1}{8}(-1 + j) + \tfrac{1}{8}(-1 - j) = 0 \]
so, by the residue theorem,
\[ \oint_C \frac{dz}{z^3(z^2 + 2z + 2)} = 2\pi j\,(0) = 0 \]

Figure 4.32 The closed contour for evaluating $\int_{-\infty}^{\infty} f(x)\,dx$.

4.6.5

Evaluation of definite real integrals 


The evaluation of definite integrals is often achieved by using the residue theorem 
together with a suitable complex function f(z) and a suitable closed contour C. In this 
section we shall briefly consider two of the most common types of real integrals that 
can be evaluated in this way. 


Type 1: Infinite real integrals of the form $\int_{-\infty}^{\infty} f(x)\,dx$ where f(x) is a
rational function of the real variable x

To evaluate such integrals we consider the contour integral
\[ \oint_C f(z)\,dz \]
where C is the closed contour illustrated in Figure 4.32, consisting of the real axis from
−R to +R and the semicircle $\Gamma$, of radius R, in the upper half z plane. Since z = x on the
real axis,
\[ \oint_C f(z)\,dz = \int_{-R}^{R} f(x)\,dx + \int_{\Gamma} f(z)\,dz \]
Then, provided that $\lim_{R \to \infty} \int_{\Gamma} f(z)\,dz = 0$, taking R → ∞ gives
\[ \oint_C f(z)\,dz = \int_{-\infty}^{\infty} f(x)\,dx \]
On the semicircular path $\Gamma$, $z = R\,e^{j\theta}$ (0 ≤ θ ≤ π), giving
\[ dz = jR\,e^{j\theta}\,d\theta \]
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Example 4.38 


Solution 


and
\[ \int_{\Gamma} f(z)\,dz = \int_0^{\pi} f(R\,e^{j\theta})\,jR\,e^{j\theta}\,d\theta \]
For this to tend to zero as R → ∞, $|f(R\,e^{j\theta})|$ must decrease at least as rapidly as $R^{-2}$,
implying that the degree of the denominator of the rational function f(x) must be at least
two more than the degree of the numerator. Thus, provided that this condition is
satisfied, this approach may be used to calculate the infinite real integral $\int_{-\infty}^{\infty} f(x)\,dx$.
Note that if f(x) is an even function of x then the same approach can also be used to
evaluate $\int_0^{\infty} f(x)\,dx$, since if f(x) is even, it follows that
\[ \int_{-\infty}^{\infty} f(x)\,dx = 2\int_0^{\infty} f(x)\,dx \]


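Using contour integration, show that
\[ \int_{-\infty}^{\infty} \frac{dx}{(x^2 + 4)^2} = \tfrac{1}{16}\pi \]

Consider the contour integral
\[ I = \oint_C \frac{dz}{(z^2 + 4)^2} \]
where C is the closed semicircular contour shown in Figure 4.32. The integrand
$1/(z^2 + 4)^2$ has poles of order two at z = ±2j. However, the only singularity inside the
contour C is the double pole at z = 2j. From (4.38),
\[ \text{residue at } z = 2j: \quad \lim_{z \to 2j} \frac{1}{1!}\frac{d}{dz}\left[(z - 2j)^2\,\frac{1}{(z - 2j)^2(z + 2j)^2}\right] = \lim_{z \to 2j} \frac{-2}{(z + 2j)^3} = \frac{-2}{(4j)^3} = -\tfrac{1}{32}j \]
so, by the residue theorem,
\[ \oint_C \frac{dz}{(z^2 + 4)^2} = 2\pi j\left(-\tfrac{1}{32}j\right) = \tfrac{1}{16}\pi \]
Since
\[ \oint_C \frac{dz}{(z^2 + 4)^2} = \int_{-R}^{R} \frac{dx}{(x^2 + 4)^2} + \int_{\Gamma} \frac{dz}{(z^2 + 4)^2} \]
letting R → ∞, and noting that the second integral becomes zero, gives
\[ \oint_C \frac{dz}{(z^2 + 4)^2} = \int_{-\infty}^{\infty} \frac{dx}{(x^2 + 4)^2} = \tfrac{1}{16}\pi \]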











z plane 


Figure 4.33 The unit-circle contour for evaluating $\int_0^{2\pi} G(\sin\theta, \cos\theta)\,d\theta$.


Example 4.39 


Solution 
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Note that in this particular case we could have evaluated the integral without using 
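contour integration. Making the substitution x = 2 tan θ, dx = 2 sec²θ dθ gives
\[ \int_{-\infty}^{\infty} \frac{dx}{(x^2 + 4)^2} = \int_{-\pi/2}^{\pi/2} \frac{2\sec^2\theta\,d\theta}{(4\sec^2\theta)^2} = \tfrac{1}{8}\int_{-\pi/2}^{\pi/2} \cos^2\theta\,d\theta = \tfrac{1}{16}\left[\tfrac{1}{2}\sin 2\theta + \theta\right]_{-\pi/2}^{\pi/2} = \tfrac{1}{16}\pi \]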
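The value π/16 obtained in Example 4.38 can be confirmed in three independent ways. The sketch below is illustrative; the truncation limit R = 200 and the step counts are assumptions chosen so that the neglected tails are small, and the function names are hypothetical.

```python
import math


def via_substitution(n=2000):
    """(1/8) times the integral of cos^2(theta) over (-pi/2, pi/2), midpoint rule."""
    h = math.pi / n
    return sum(math.cos(-math.pi / 2 + (k + 0.5) * h) ** 2 for k in range(n)) * h / 8


def via_truncation(R=200.0, n=400000):
    """Midpoint rule for the integral of 1/(x^2 + 4)^2 over the finite range [-R, R]."""
    h = 2 * R / n
    return sum(1.0 / ((-R + (k + 0.5) * h) ** 2 + 4.0) ** 2 for k in range(n)) * h


# Residue at the double pole z = 2j, from (4.38): -2 / (z + 2j)^3 evaluated at z = 2j.
residue_at_2j = -2 / (4j) ** 3
via_residues = 2j * math.pi * residue_at_2j

if __name__ == "__main__":
    print(via_substitution(), via_truncation(), via_residues, math.pi / 16)
```

The truncated integral omits the tails beyond |x| = R, which here are of order $1/R^3$, so it agrees with the other two values only to that accuracy.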


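Type 2: Real integrals of the form $I = \int_0^{2\pi} G(\sin\theta, \cos\theta)\,d\theta$ where G is
a rational function of sin θ and cos θ

We take $z = e^{j\theta}$, so that
\[ \sin\theta = \frac{1}{2j}\left(z - \frac{1}{z}\right), \qquad \cos\theta = \frac{1}{2}\left(z + \frac{1}{z}\right) \]
and
\[ dz = j\,e^{j\theta}\,d\theta, \quad \text{or} \quad d\theta = \frac{dz}{jz} \]
On substituting back, the integral I becomes
\[ I = \oint_C f(z)\,dz \]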


where C is the unit circle |z| = 1 shown in Figure 4.33. 


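Using contour integration, evaluate
\[ I = \int_0^{2\pi} \frac{d\theta}{2 + \cos\theta} \]

Take $z = e^{j\theta}$, so that
\[ \cos\theta = \frac{1}{2}\left(z + \frac{1}{z}\right), \qquad d\theta = \frac{dz}{jz} \]
On substituting, the integral becomes
\[ I = \oint_C \frac{dz}{jz\left[2 + \tfrac{1}{2}(z + 1/z)\right]} = \frac{2}{j}\oint_C \frac{dz}{z^2 + 4z + 1} \]
where C is the unit circle |z| = 1 shown in Figure 4.33. The integrand has singularities at
\[ z^2 + 4z + 1 = 0 \]
that is, at z = −2 ± √3. The only singularity inside the contour C is the simple pole at
z = −2 + √3. From (4.37),
\[ \text{residue at } z = -2 + \sqrt{3}: \quad \lim_{z \to -2+\sqrt{3}} \left[\frac{2}{j}\,\frac{z + 2 - \sqrt{3}}{(z + 2 - \sqrt{3})(z + 2 + \sqrt{3})}\right] = \frac{2}{j}\,\frac{1}{2\sqrt{3}} = \frac{1}{j\sqrt{3}} \]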
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so, by the residue theorem, 


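\[ I = 2\pi j\left(\frac{1}{j\sqrt{3}}\right) = \frac{2\pi}{\sqrt{3}} \]
Thus
\[ \int_0^{2\pi} \frac{d\theta}{2 + \cos\theta} = \frac{2\pi}{\sqrt{3}} \]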

4.6.6 Exercises 


60 Evaluate the integral
\[ \oint_C \frac{z\,dz}{z^2 + 1} \]
where C is

(a) the circle |z| = 1/2 (b) the circle |z| = 2


61 Evaluate the integral 
PES 312-2 {> 
c г +92 
where C is 


(a) the circle |z| 2 1 (b) the circle |z| 2 4 


62 Calculate the residues at all the poles of the function 


Дә) = (22+2)(22 +4) 
(2 +1)(2 +6) 


Hence calculate the integral 


} f(z)dz 
© 


where C is 


(a) the circle |z| 2 2 
(c) the circle |z| 2 4 


(b) the circle |z — j| 2 1 


63 Evaluate the integral 


dz 
201 +2) 


where C is 


(a) the circle |z| = i (b) the circle |z| 2 2 


64 Using the residue theorem, evaluate the following 
contour integrals: 


«d Gz +2)dz 
c G- DG +4) 


(i) thecircle |z- 2| 2 2 


where C is 
(ii) the circle |z| = 4 


(b) } (22 - 22) 2 
с (2+1)(22+ 4) 


(1) the circle |z| 2 3 


where Cis 4 `. . 
(ii) the circle |z +j| = 2 


) dz 
с (@+1)°(@- 1)(@-2) 


(i) the circle |z| 2 


where Cis J (ii) thecircle |z -1| 2 1 
(їп) the rectangle with vertices 
at +j, 3 +j 


od (z-1)dz 
c G - 4241» 


(i) the circle |z| = } 
_ | (ii) the circle E +| =2 
where C is 
(iii) the triangle with vertices 


at -i4j-2-)3-j0 
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65 Using a suitable contour integral, evaluate the d 2 
; : x dx 
following real integrals: ( ре 
st +1)(х -2x-2) 
dx dx 7 dx 
(a) (b) h = 
X +x+l +1) (8) 3- Sues (h) aX #1 
d 
(c) | е G) 
o x +1)(х +4) (2 Ax e 5 с 
2n 2n 
cos 30 4 d8 . cosQü — 
| 5-4cos0 (e) | 5+4 5100 of Iid 


4.7 Engineering application: analysing AC circuits


In the circuit shown in Figure 4.34 we wish to find the variation in impedance Z and
admittance Y as the capacitance C of the capacitor varies from 0 to ∞. Here
\[ \frac{1}{Z} = \frac{1}{R} + j\omega C, \qquad Y = \frac{1}{Z} \]
Writing
\[ \frac{1}{Z} = \frac{1 + j\omega CR}{R} \]
we clearly have
\[ Z = \frac{R}{1 + j\omega CR} \qquad (4.50) \]

Figure 4.34 AC circuit of Section 4.7.

Equation (4.50) can be interpreted as a bilinear mapping with Z and C as the two variables.
We examine what happens to the real axis in the C plane (C varies from 0 to ∞
and, of course, is real) under the inverse of the mapping given by (4.50). Rearranging
(4.50), we have
\[ C = \frac{R - Z}{j\omega RZ} \qquad (4.51) \]
Taking Z = x + jy
\[ C = \frac{R - x - jy}{j\omega R(x + jy)} = \frac{x + jy - R}{\omega R(y - jx)} = \frac{(x + jy - R)(y + jx)}{\omega R(x^2 + y^2)} \qquad (4.52) \]
Equating imaginary parts, and remembering that C is real, gives
\[ 0 = x^2 + y^2 - Rx \qquad (4.53) \]
which represents a circle, with centre at $(\tfrac{1}{2}R, 0)$ and of radius $\tfrac{1}{2}R$. Thus the real axis in
the C plane is mapped onto the circle given by (4.53) in the Z plane. Of course, C is
positive. If C = 0, (4.50) indicates that Z = R. The circuit of Figure 4.34 confirms
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Figure 4.35 Mapping 
for the impedance Z. 





C plane Z plane 








Figure 4.36 Mapping 


for the admittance Y. С оо 


C increasing 





that the impedance is R in this case. If C → ∞ then Z → 0, so the positive real axis in
the C plane is mapped onto either the upper or lower half of the circle. Equating real parts
in (4.52) gives
\[ C = \frac{-y}{\omega(x^2 + y^2)} \]
so C > 0 gives y < 0, implying that the lower half of the circle is the image in the 
Z plane of the positive real axis in the C plane, as indicated in Figure 4.35. A diagram 
such as Figure 4.35 gives an immediate visual impression of how the impedance 
Z varies as C varies. 
The admittance Y = 1/Z is given by
\[ Y = \frac{1}{R} + j\omega C \]


which represents a linear mapping as shown in Figure 4.36. 
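The geometry described in this section can be explored numerically. The following sketch assumes illustrative component values (R = 50 Ω and a frequency of 1 kHz, neither taken from the text) and checks that every impedance value lies on the lower half of the circle (4.53), while the admittance moves along a vertical straight line.

```python
import math


def impedance(R, omega, C):
    """Impedance of the parallel RC combination, equation (4.50)."""
    return R / (1 + 1j * omega * C * R)


def admittance(R, omega, C):
    """Admittance Y = 1/Z = 1/R + j*omega*C."""
    return 1 / R + 1j * omega * C


R = 50.0                       # ohms (assumed value)
omega = 2 * math.pi * 1000.0   # rad/s (assumed value)
capacitances = (0.0, 1e-7, 1e-6, 1e-5, 1e-3)

if __name__ == "__main__":
    for C in capacitances:
        Z = impedance(R, omega, C)
        # Distance of Z from the centre (R/2, 0); this should equal R/2.
        print(C, Z, abs(Z - R / 2), admittance(R, omega, C))
```

As C increases, Z moves round the lower semicircle from R towards 0, while the real part of Y stays fixed at 1/R.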


4.8 Engineering application: use of harmonic functions


In this section we discuss two engineering applications where use is made of the 
properties of harmonic functions. 


4.8.1 A heat transfer problem 


We saw in Section 4.3.2 that every analytic function generates a pair of harmonic 
functions. The problem of finding a function that is harmonic in a specified region 
and satisfies prescribed boundary conditions is one of the oldest and most important 
problems in science-based engineering. Sometimes the solution can be found by means 


Temperature 0 ?C 


Temperature 100°C 


Figure 4.37 
Schematic diagram of 
heat transfer problem. 


r=0.3R 


Figure 4.38 
The mapping 
м = (2 – 3)/(32 – 1). 
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of a conformal mapping defined by an analytic function. This, essentially, is a con- 
sequence of the ‘function of a function’ rule of calculus, which implies that every 
harmonic function of x and y transforms into a harmonic function of u and v under the 


mapping
\[ w = u + jv = f(x + jy) = f(z) \]


where f(z) is analytic. Furthermore, the level curves of the harmonic function in the 
z plane are mapped onto corresponding level curves in the w plane, so that a harmonic 
function that has a constant value along part of the boundary of a region or has a zero 
normal derivative along part of the boundary is mapped onto a harmonic function with 
the same property in the w plane. 

For heat transfer problems the level curves of the harmonic function correspond to
isotherms, and a zero normal derivative corresponds to thermal insulation. To illustrate
these ideas, consider the simple steady-state heat transfer problem shown schematically
in Figure 4.37. There is a cylindrical pipe with an offset cylindrical cavity through
which steam passes at 100 °C. The outer temperature of the pipe is 0 °C. The radius of
the inner circle is 3/10 of that of the outer circle, so by choosing the outer radius as the
unit of length the problem can be stated as that of finding a harmonic function T(x, y)
such that
\[ \frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} = 0 \]
in the region between the circles |z| = 1 and |z − 0.3| = 0.3, and T = 0 on |z| = 1 and
T = 100 on |z − 0.3| = 0.3.
The mapping
\[ w = \frac{z - 3}{3z - 1} \]
transforms the circle |z| = 1 onto the circle |w| = 1 and the circle |z − 0.3| = 0.3 onto
the circle |w| = 3 as shown in Figure 4.38. Thus the problem is transformed into the
axially symmetric problem in the w plane of finding a harmonic function T(u, v) such
that T(u, v) = 0 on |w| = 1 and T(u, v) = 100 on |w| = 3 (each circle carries over the
boundary temperature of its counterpart in the z plane). Harmonic functions with such
axial symmetry have the general form
\[ T(u, v) = A\ln(u^2 + v^2) + B \]
where A and B are constants.

Here we require, in addition to the axial symmetry, that T(u, v) = 0 on $u^2 + v^2 = 1$
and T(u, v) = 100 on $u^2 + v^2 = 9$. Thus B = 0 and A = 100/ln 9, and the solution on the
w plane is
\[ T(u, v) = \frac{100\ln(u^2 + v^2)}{\ln 9} \]
We need the solution on the z plane, which means in general we have to obtain u and
v in terms of x and y. Here, however, it is a little easier, since $u^2 + v^2 = |w|^2$ and
\[ |w|^2 = \left|\frac{z - 3}{3z - 1}\right|^2 = \frac{|z - 3|^2}{|3z - 1|^2} = \frac{(x - 3)^2 + y^2}{(3x - 1)^2 + 9y^2} \]
Thus
\[ T(x, y) = \frac{100}{\ln 9}\left\{\ln[(x - 3)^2 + y^2] - \ln[(3x - 1)^2 + 9y^2]\right\} \]
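The mapping and the boundary temperatures of this heat transfer problem can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the boundary values from the problem statement (T = 0 on |z| = 1 and T = 100 on |z − 0.3| = 0.3), fixes the constants A and B from those values on the image circles |w| = 1 and |w| = 3, and uses hypothetical function names.

```python
import cmath
import math


def w_of_z(z):
    """The bilinear mapping w = (z - 3) / (3z - 1)."""
    return (z - 3) / (3 * z - 1)


def temperature(z, T_outer=0.0, T_cavity=100.0):
    """T = A ln|w|^2 + B, with A and B fixed by the two boundary temperatures."""
    B = T_outer                                   # ln|w|^2 = 0 on |w| = 1
    A = (T_cavity - T_outer) / math.log(9.0)      # ln|w|^2 = ln 9 on |w| = 3
    return A * math.log(abs(w_of_z(z)) ** 2) + B


def circle_points(centre, radius, n=12):
    """n equally spaced points on the circle |z - centre| = radius."""
    return [centre + radius * cmath.exp(2j * math.pi * k / n) for k in range(n)]


if __name__ == "__main__":
    for z in circle_points(0j, 1.0):
        print("outer", abs(w_of_z(z)), temperature(z))
    for z in circle_points(0.3 + 0j, 0.3):
        print("cavity", abs(w_of_z(z)), temperature(z))
    print("sample interior point", temperature(-0.5 + 0j))
```

Points on the outer circle should map to |w| = 1 with temperature 0, points on the cavity wall to |w| = 3 with temperature 100, and points in between to intermediate temperatures.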


4.8.2 Current in a field-effect transistor 


The fields $(E_x, E_y)$ in an insulated-gate field-effect transistor are harmonic conjugates
that satisfy a nonlinear boundary condition. For the transistor shown schematically in
Figure 4.39 we have


2E, OE, ^ QE -oE, 
дх ду’ dy ox 


with conditions 


E,—0 onthe electrodes 





E, (6+ A zad on the channel 
" Ah 2це,ё, 


E> -5 as x—-e (0«yc«h) 


E, 5s as x70 (0 <у< л) 


where $V_0$ is a constant with dimensions of potential, h is the insulator thickness, I is the
current in the channel, which is to be found, $\mu$, $\varepsilon_0$ and $\varepsilon_r$ have their usual meanings, and
the gate potential $V_g$ and the drain potential $V_d$ are taken with respect to the source
potential.


Figure 4.39 

(a) Schematic diagram 
for an insulated-gate 
field-effect transistor; 
(b) an appropriate 

coordinate system for ^ ——————————5: 4- 2 4 
the application. Source electrode Channel Drain electrode 


Gate electrode 
—M— 





(a) (b) 
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The key to the solution of this problem is the observation that the nonlinear boundary 
condition 


2E, (5^ = = 
h HEE; 





contains the harmonic function (now of £, and Е,) 


HE, E) - 2 (E, +“) 


A harmonic conjugate of H is the function 


V. 2 
GG, E)- (E 723] -E 


Since $E_x$ and $E_y$ are harmonic conjugates with respect to x and y, so are G and H. Thus
the problem may be restated as that of finding harmonic conjugates G and H such that 


H=0 on the electrodes 





H=- 2 on the channel 
gey 


Е 2 
(88) а х= оу 


_ 2 
G5 (n as x>- (0«y«h) 


Using the sequence of mappings shown in Figure 4.40, which may be composed into 
the single formula 


bz 2 
wae -a 
ae”-1 
where a = e"^ and b = n/h, the problem is transformed into finding harmonic-conjugate 


functions G and H (on the w plane) such that 





H=0 on v=0 (u>0) (4.54) 
H=-—— on v=0 (we) (4.55) 
HEE, 
= 2 
G- (S at w-e* (4.56) 
2 2 
б= (At) at w-1 (4.57) 


The conditions (4.54), (4.55) and (4.57) are sufficient to determine H and G completely 


i=  larg(w) 


TEE, 





G= Ualwl , (“2 Ki 2 
TEE, h 
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Figure 4.40 y 
Sequence of mappings 
to simplify the 
problem. z plane 
z—ziL 
z —> nzíh 
Zoe 
292-1 


z 2 zl(a? - l) 





zo ll 





z—l-z 


w plane 





while the condition (4.56) determines the values of 7 
E£, 
I= — QV, - 2V, 4 VV, 


This example shows the power of complex variable methods for solving difficult 
problems arising in engineering mathematics. The following exercises give some 
simpler examples for the reader to investigate. 


66 


67 


68 
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4.8.3 Exercises 


Show that the transformation w = 1/z, w = u + jv,
z = x + jy, transforms the circle $x^2 + y^2 = 2ax$ in the
z plane into the straight line u = 1/(2a) in the w plane.
Two long conducting wires of radius a are placed 
adjacent and parallel to each other, so that their 
cross-section appears as in Figure 4.41. The 

wires are separated at O by an insulating gap of 
negligible dimensions, and carry potentials £V, 
as indicated. Find an expression for the potential 69 
at a general point (x, y) in the plane of the cross- 
section and sketch the equipotentials. 





Figure 4.41 Conducting wires of Exercise 66. 


Find the images under the mapping 


-z*l 
l-z 
z - x * jy, of 


(a) the points A(-1, 0), B(0, 1), C(25, 4) and 
D(3, 0) in the z plane, 

(b) the straight line y = 0, 

(c) the circle x? +y = 1. 


Illustrate your answer with a diagram showing the 
z and w planes and shade on the w plane the region 
corresponding to x? 4 < 1. 

A semicircular disc of unit radius, [(x, y): 
xX +y <1,y = 0], has its straight boundary at 70 
temperature 0 °C and its curved boundary at 100°C. 
Prove that the temperature at the point (x, y) is 


T= ar! ( —HM— ) 


T ages у 
(a) Show that the function 


G(x, y) 7 2x(1 — y) 
satisfies the Laplace equation and construct 
its harmonic conjugate H(x, y) that satisfies 
(0, 0) = 0. Hence obtain, in terms of z, where 
2= х + jy, the function F such that W = F(z) 
where W = G + jH. 


(b) Show that under the mapping w = Inz, the 
harmonic function G(x, y) defined in (a) is 
mapped into the function 


G(u, v) ^ 2e"cosv — e" sin2v 
Verify that G(u, v) is harmonic. 


(c) Generalize the result (b) to prove that under 
the mapping w = f(z), where f’(z) exists, a 
harmonic function of (x, y) is transformed 
into a harmonic function of (u, v). 


Show that if w = (z + 3)/(z — 3), w = u + jv, 
z =x + jy, the circle u? +1? = k? in the w plane 
is the image of the circle 
2 
xX +y +6——x+9=0 (К #1) 
l-k 
in the z plane. 
Two long cylindrical wires, each of radius 

4 mm, are placed parallel to each other with their 
axes 10 mm apart, so that their cross-section 
appears as in Figure 4.42. The wires carry potentials 
+V, as shown. Show that the potential V(x, y) at the 
point (x, y) is given by 


V- D. {и [(х +3) +7] - In[x- 39 y) 
In4 


Figure 4.42 Cylindrical wires of Exercise 69. 


Find the image under the mapping 


w - id-2 


7 1+2 
z=x+jy,w=u + jv, of 


(a) the points A(1, 0), B(0, 1), C(0, —1) in the 
z plane, 

(b) the straight line y = 0, 

(c) the circle x? +)? = 1. 


A circular plate of unit radius, [(x, y): 2° +)" < 1], 
has one half (with y > 0) of its rim, x7 +)” = 1, at 

temperature 0 °C and the other half (with y < 0) at 
temperature 100 °C. Using the above mapping, prove 
that the steady-state temperature at the point (x, y) is 


2 2 
Т = 100 У — ) 
T 2y 
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71 The problem shown schematically in 
Figure 4.43 arose during a steady-state heat 
transfer investigation. T is the temperature. 
By applying the successive mappings 





T=0 


Figure 4.43 Schematic representation of 
Exercise 71. 


EXPE w — Inz, 
z-j4 


21 


show that the temperature at the point (x, y) in the 
shaded region in the figure is given by 


2 2 
Т(х, у) = 20м etry try s 
In 3 x +(4-y) 





72 The functions 
Figure 4.44 Mappings of Exercise 72. 





w=Zzt 1 e wS 2+1 
2 2-1 
perform the mappings shown in Figure 4.44. A long 
bar of semicircular cross-section has the temperature YA 
of the part of its curved surface corresponding to | 
the arc PQ in Figure 4.45 kept at 100?C while the Q E 
rest of the surface is kept at 0?C. Show that the 
temperature T at the point (x, y) is given by 
T= 108 farg(? +2 + 1) - arg - z 1] . | 
T Figure 4.45 Cross-section of bar of Exercise 72. 


4.9 Review exercises (1-24) 


1 Find the images of the following points under the 2 Under each of the mappings given in Review 
mappings given: exercise 1, find the images in the w plane of the 

Оаа ее two straight lines 

(b) z21-j2 under w=j3z+j+l (а) у= 2х 

(с) 2=1 under у= 1(1- )2+1(1 +) (Ы) х+у=1 


(Ф 2 = 2 under у= 2 (1 )2+ 1(1 +) in the z plane, z = x + ју. 


3 The linear mapping $w = \alpha z + \beta$, where $\alpha$ and $\beta$ are complex constants,
maps the point $z = 2 - j$ in the z plane to the point $w = 1$ in the w plane, and the
point $z = 0$ to the point $w = 3 + j$.

(a) Determine $\alpha$ and $\beta$.
(b) Find the region in the w plane corresponding to the left half-plane
$\mathrm{Re}(z) \le 0$ in the z plane.
(c) Find the region in the w plane corresponding to the circular region
$5|z| \le 1$ in the z plane.
(d) Find the fixed point of the mapping.


Map the following straight lines from the 
z plane, z = x + jy, to the w plane under the 
inverse mapping w = j/z: 


(а) х=у+1 10 
(6) у= 3х 
(c) the line joining A(1 + j) to B(2 + j3) in the 
z plane 
Oe 


In each case sketch the image curve. 





Two complex variables w and z are related by the 11 
mapping 
aal 
E 
ZEN 


Sketch this mapping by finding the images 
in the w plane of the lines Re(z) = constant and 
Im(z) = constant. Find the fixed points of the 


mapping. 


The mapping 5 
1 


takes points from the z plane to the w plane. Find 
the fixed points of the mapping, and show that the 
circle of radius r with centre at the origin in the 

z plane is transformed to the ellipse 


ur Y vr Y 2 
ey 
aa rl 


in the w plane, where w = u + jv. Investigate what 
happens when r = 1. 








13 


Find the real and imaginary parts of the complex 
function w =z’, and verify the Cauchy-Riemann 
equations. 
Find a function $v(x, y)$ such that, given

$$u(x, y) = x \sin x \cosh y - y \cos x \sinh y$$

$f(z) = u + jv$ is an analytic function of z, $f(0) = 0$.


Find the bilinear transformation that maps the three points $z = 0$, $j$ and
$\tfrac{1}{2}(1 + j)$ in the z plane to the three points $w = \infty$, $-j$ and $1 - j$
respectively in the w plane. Check that the transformation will map

(a) the lower half of the z plane onto the upper half of the w plane

(b) the interior of the circle with centre $z = j\tfrac{1}{2}$ and radius $\tfrac{1}{2}$ in
the z plane onto the half-plane $\mathrm{Im}(w) < -1$ in the w plane.


Show that the mapping 
gi 
= + — 
йк (4 ae 


where $z = x + jy$ and $\zeta = R\,e^{j\theta}$, maps the circle
R = constant in the $\zeta$ plane onto an ellipse in the
z plane. Suggest a possible use for this mapping.


Find the power series representation of the 
function 


1 
l+ 





in the disc |z| < 1. Deduce the power series for 
1 
(127)? 


valid in the same disc. 





Find the first four non-zero terms of the Taylor 
series expansion of the following functions about 
the point indicated, and determine the radius of 
convergence of each: 


I 











1-2 = = 
eo) Oe 
() — €-) 


Find the radius of convergence of each Taylor 
series expansion of the following function about the 
points indicated, without finding the series itself: 


at the роіпіѕ 2 = 1, –1, 1+], 1 к and 2 4 j3. 
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14 


15 


16 


I 


18 


Illo 


20 
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Determine the Laurent series expansion of the 
function 
1 
р 
(zac 


about the points (a) z = 0 and (b) z= 1, and 
determine the region of validity of each. 





f(z) = 


Find the Laurent series expansion of the function 
E 1 
Л) = ееш 
ПЕ 


about (a) z 4 0, (b) z « 1 and (c) z 7 ee, indicating 
the range of validity in each case. (Do rot find terms 
explicitly; indicate only the form of the principal 
part.) 


Find the real and imaginary parts of the functions 
(a) e'sinhz (b) cos2z 


(ше ПОТ 


Determine whether the following mappings are 


conformal, and, if not, find the non-conformal points: 


(а) w=4 


N 


(Б) у= 22? + 322 + 6(1 – ј)2+ 1 


(с) w= 642+ +. 
2 


Consider the mapping w = cosz. Determine the points 
where the mapping is not conformal. By finding the 
images in the w plane of the lines x = constant and 
y = constant in the z plane (z = x + jy), draw the 
mapping similarly to Figures 4.14 and 4.18. 


Determine the location of and classify the 
singularities of the following functions: 


sin z D 1 
DEM eae 
(с) 241 (@) sechz 

p 


(e) sinhz — (f) sin( 4) (g) z' 


Find the residues of the following functions at the 
points indicated: 


23 


24 


2z 








e em cos Z zi 
D TIN c 

tanz — il 2 ==! 
(© 27 @=im @ со (с= -8) 


Find the poles and zeros, and determine all the 
residues, of the rational function 


fiz) = (z - 1) 43245) 


z(z^ 41) 
Determine the residue of the rational function 


z!-«6z - 302° 
(z-1-j)° 


Evaluate the following contour integrals along 
the circular paths indicated: 


(a) ÉL 
ee Veo 


o$ a +1)(22 Е 
(2° + 9)(2° +4) 


bat а 
(c) where 

bm (ii) C is |z| 22 
d 

n б = бузуб +) 


where | 


where C is |z| 2 2 


where C is |z| = 


(1) Сл (= 

(ШС mS 
3 

(e) } a 

а) (а) 


(f) } ке dz , where 
e A2) (6-3) 


, Where C is |z -j| 7 1 
үе 


(Ж ОЕ” 


Using a suitable contour integral, evaluate the 
following real integrals: 


2 
Cra doo 


( а aa 
a |. (x EY Ge E232) 


(b) E (c) | 
QX 16 й 


2n 
cos 20 d0 
Ө | 5—4cos 0 


sin 0 dO 
5 +4 соѕ Ө 
5.1 Introduction


Laplace transform methods have a key role to play in the modern approach to the 
analysis and design of engineering systems. The stimulus for developing these methods 
was the pioneering work of the English electrical engineer Oliver Heaviside (1850— 
1925) in developing a method for the systematic solution of ordinary differential 
equations with constant coefficients. Heaviside was concerned with solving prac- 
tical problems, and his method was based mainly on intuition, lacking mathematical 
rigour: consequently it was frowned upon by theoreticians at the time. However, 
Heaviside himself was not concerned with rigorous proofs, and was satisfied that his 
method gave the correct results. Using his ideas, he was able to solve important 
practical problems that could not be dealt with using classical methods. This led to 
many new results in fields such as the propagation of currents and voltages along 
transmission lines. 

Because it worked in practice, Heaviside's method was widely accepted by engineers. 
As its power for problem-solving became more and more apparent, the method attracted 
the attention of mathematicians, who set out to justify it. This provided the stimulus for 
rapid developments in many branches of mathematics including improper integrals, 
asymptotic series and transform theory. Research on the problem continued for many 
years before it was eventually recognized that an integral transform developed by the 
French mathematician Pierre Simon de Laplace (1749-1827) almost a century before 
provided a theoretical foundation for Heaviside’s work. It was also recognized that the 
use of this integral transform provided a more systematic alternative for investigating 
differential equations than the method proposed by Heaviside. It is this alternative 
approach that is the basis of the Laplace transform method. 

We have already come across instances where a mathematical transformation has 
been used to simplify the solution of a problem. For example, the logarithm is used to 
simplify multiplication and division problems. To multiply or divide two numbers, we 
transform them into their logarithms, add or subtract these, and then perform the 
inverse transformation (that is, the antilogarithm) to obtain the product or quotient of 
the original numbers. The purpose of using a transformation is to create a new domain 
in which it is easier to handle the problem being investigated. Once results have been 
obtained in the new domain, they can be inverse-transformed to give the desired results 
in the original domain. 
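
The logarithm analogy above can be made concrete with a few lines of Python. This is only an illustrative sketch: the function name `multiply_via_logs` and the sample numbers are assumptions introduced here, not part of the text. It follows the three steps described: transform, operate in the new domain, inverse-transform.

```python
import math


def multiply_via_logs(a: float, b: float) -> float:
    """Multiply two positive numbers by adding their logarithms."""
    # Forward transformation: move both numbers into the "log domain".
    log_a, log_b = math.log(a), math.log(b)
    # In the log domain the multiplication has become an addition.
    log_product = log_a + log_b
    # Inverse transformation (antilogarithm) back to the original domain.
    return math.exp(log_product)


product = multiply_via_logs(12.5, 8.0)
print(product)  # agrees with 12.5 * 8.0 up to floating-point rounding
```

The Laplace transform plays the same role for differential equations: the hard operation (differentiation) becomes an easy one (multiplication by s) in the new domain.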

The Laplace transform is an example of a class called integral transforms, and it 
takes a function f(t) of one variable t (which we shall refer to as time) into a function
F(s) of another variable s (the complex frequency). Another integral transform widely 
used by engineers is the Fourier transform, which is dealt with in Chapter 8. The 
attraction of the Laplace transform is that it transforms differential equations in the t 
(time) domain into algebraic equations in the s (frequency) domain. Solving differential
equations in the t domain therefore reduces to solving algebraic equations
in the s domain. Having done the latter for the desired unknowns, their values as 
functions of time may be found by taking inverse transforms. Another advantage of 
using the Laplace transform for solving differential equations is that initial conditions 
play an essential role in the transformation process, so they are automatically
incorporated into the solution. This contrasts with the classical approach considered
in Chapter 10 of the companion text Modern Engineering Mathematics,
where the initial conditions are only introduced when the unknown constants of
integration are determined. The Laplace transform is therefore an ideal tool for solving
initial-value problems such as those occurring in the investigation of electrical circuits
and mechanical vibrations.

Figure 5.1 Schematic representation of a system: the input (or excitation) u(t) is
applied to the SYSTEM block, which produces the output (or response) x(t).

The Laplace transform finds particular application in the field of signals and linear 
systems analysis. A distinguishing feature of a system is that when it is subjected to 
an excitation (input), it produces a response (output). When the input u(t) and output 
x(t) are functions of a single variable t, representing time, it is normal to refer to them
as signals. Schematically, a system may be represented as in Figure 5.1. The problem
facing the engineer is that of determining the system output x(t) when it is subjected to
an input u(t) applied at some instant of time, which we can take to be t = 0. The
relationship between output and input is determined by the laws governing the behaviour of
the system. If the system is linear and time-invariant then the output is related to the 
input by a linear differential equation with constant coefficients, and we have a standard 
initial-value problem, which is amenable to solution using the Laplace transform. 

While many of the problems considered in this chapter can be solved by the classical 
approach, the Laplace transform leads to a more unified approach and provides the 
engineer with greater insight into system behaviour. In practice, the input signal u(t)
may be a discontinuous or periodic function, or even a pulse, and in such cases the 
use of the Laplace transform has distinct advantages over the classical approach. Also, 
more often than not, an engineer is interested not only in system analysis but also in 
system synthesis or design. Consequently, an engineer's objective in studying a sys- 
tem's response to specific inputs is frequently to learn more about the system with a 
view to improving or controlling it so that it satisfies certain specifications. It is in this 
area that the use of the Laplace transform is attractive, since by considering the system 
response to particular inputs, such as a sinusoid, it provides the engineer with powerful 
graphical methods for system design that are relatively easy to apply and widely used 
in practice. 

In modelling the system by a differential equation, it has been assumed that both 
the input and output signals can vary at any instant of time; that is, they are functions 
of a continuous time variable (note that this does not mean that the signals themselves 
have to be continuous functions of time). Such systems are called continuous-time 
systems, and it is for investigating these that the Laplace transform is best suited. 
With the introduction of computer control into system design, signals associated with 
a system may only change at discrete instants of time. In such cases the system is said 
to be a discrete-time system, and is modelled by a difference equation rather than a 
differential equation. Such systems are dealt with using the z transform considered in 
Chapter 6. 
5.2 The Laplace transform


5.2.1 Definition and notation 


We define the Laplace transform of a function f(t) by the expression

$$\mathcal{L}\{f(t)\} = \int_0^{\infty} e^{-st} f(t)\,dt \qquad (5.1)$$

where s is a complex variable and $e^{-st}$ is called the kernel of the transformation.
It is usual to represent the Laplace transform of a function by the corresponding
capital letter, so that we write

$$\mathcal{L}\{f(t)\} = F(s) = \int_0^{\infty} e^{-st} f(t)\,dt \qquad (5.2)$$

An alternative notation in common use is to denote $\mathcal{L}\{f(t)\}$ by $\bar{f}(s)$ or simply $\bar{f}$.
Before proceeding, there are a few observations relating to the definition (5.2) worthy 
of comment. 


(a) The symbol $\mathcal{L}$ denotes the Laplace transform operator; when it operates on a
function f(t), it transforms it into a function F(s) of the complex variable s. We
say the operator transforms the function f(t) in the t domain (usually called the
time domain) into the function F(s) in the s domain (usually called the complex
frequency domain, or simply the frequency domain). This relationship is
depicted graphically in Figure 5.2, and it is usual to refer to f(t) and F(s) as a
Laplace transform pair, written as {f(t), F(s)}.


Figure 5.2 The Laplace transform operator $\mathcal{L}\{\ \}$, taking the t domain (time domain)
to the s domain (frequency domain).


(b) Because the upper limit in the integral is infinite, the domain of integration is 
infinite. Thus the integral is an example of an improper integral, as introduced 
in Section 9.2 of Modern Engineering Mathematics; that is, 


$$\int_0^{\infty} e^{-st} f(t)\,dt = \lim_{T\to\infty} \int_0^{T} e^{-st} f(t)\,dt$$


This immediately raises the question of whether or not the integral converges, an 
issue we shall consider in Section 5.2.3. 


(c) Because the lower limit in the integral is zero, it follows that when taking the
Laplace transform, the behaviour of f(t) for negative values of t is ignored or
suppressed. This means that F(s) contains information on the behaviour of f(t)
only for $t \ge 0$, so that the Laplace transform is not a suitable tool for investigating
problems in which values of f(t) for t < 0 are relevant. In most engineering
applications this does not cause any problems, since we are then concerned with physical
systems for which the functions we are dealing with vary with time t. An attribute
of physically realizable systems is that they are non-anticipatory in the sense
that there is no output (or response) until an input (or excitation) is applied.
Because of this causal relationship between the input and output, we define a
function f(t) to be causal if f(t) = 0 (t < 0). In general, however, unless the
domain is clearly specified, a function f(t) is normally interpreted as being defined
for all real values, both positive and negative, of t. Making use of the Heaviside
unit step function H(t) (see also Section 5.5.1), where

$$H(t) = \begin{cases} 0 & (t < 0) \\ 1 & (t \ge 0) \end{cases}$$

we have

$$f(t)H(t) = \begin{cases} 0 & (t < 0) \\ f(t) & (t \ge 0) \end{cases}$$

Thus the effect of multiplying f(t) by H(t) is to convert it into a causal function.
Graphically, the relationship between f(t) and f(t)H(t) is as shown in Figure 5.3.

Figure 5.3 Graph of f(t) and its causal equivalent function f(t)H(t).

It follows that the corresponding Laplace transform F(s) contains full
information on the behaviour of f(t)H(t). Consequently, strictly speaking one
should refer to {f(t)H(t), F(s)} rather than {f(t), F(s)} as being a Laplace
transform pair. However, it is common practice to drop the H(t) and assume that we
are dealing with causal functions.

(d) If the behaviour of f(t) for t < 0 is of interest then we need to use the alternative
two-sided or bilateral Laplace transform of the function f(t), defined by

$$\mathcal{L}_B\{f(t)\} = \int_{-\infty}^{\infty} e^{-st} f(t)\,dt \qquad (5.3)$$

The Laplace transform defined by (5.2), with lower limit zero, is sometimes
referred to as the one-sided or unilateral Laplace transform of the function f(t).
In this chapter we shall concern ourselves only with the latter transform, and refer
to it simply as the Laplace transform of the function f(t). Note that when f(t) is a
causal function,

$$\mathcal{L}_B\{f(t)\} = \mathcal{L}\{f(t)\}$$
(e) Another issue concerning the lower limit of zero is the interpretation of f(0) when
f(t) has a peculiarity at the origin. The question then arises as to whether or not
we should include the peculiarity and take the lower limit as $0^-$ or exclude it and
take the lower limit as $0^+$ (as is conventional, $0^-$ and $0^+$ denote values of t just to the
left and right of the origin respectively). Provided we are consistent, we can take
either, both interpretations being adopted in practice. In order to accommodate
any peculiarities that may occur at t = 0, such as an impulse applied at t = 0, we
take $0^-$ as the lower limit and interpret (5.2) as

$$\mathcal{L}\{f(t)\} = F(s) = \int_{0^-}^{\infty} e^{-st} f(t)\,dt \qquad (5.4)$$

We shall return to this issue when considering the impulse response in Section 5.5.8.


5.2.2 Transforms of simple functions


In this section we obtain the Laplace transformations of some simple functions.


Example 5.1 Determine the Laplace transform of the function

$$f(t) = c$$

where c is a constant.


Solution Using the definition (5.2),

$$\mathcal{L}\{c\} = \int_0^{\infty} e^{-st} c\,dt = \lim_{T\to\infty} \int_0^{T} e^{-st} c\,dt$$

$$= \lim_{T\to\infty} \left[-\frac{c}{s}\,e^{-st}\right]_0^T = \frac{c}{s}\left(1 - \lim_{T\to\infty} e^{-sT}\right)$$

Taking $s = \sigma + j\omega$, where $\sigma$ and $\omega$ are real,

$$\lim_{T\to\infty} e^{-sT} = \lim_{T\to\infty} e^{-(\sigma + j\omega)T} = \lim_{T\to\infty} e^{-\sigma T}(\cos \omega T - j \sin \omega T)$$

A finite limit exists provided that $\sigma = \mathrm{Re}(s) > 0$, when the limit is zero. Thus, provided
that Re(s) > 0, the Laplace transform is

$$\mathcal{L}\{c\} = \frac{c}{s}, \quad \mathrm{Re}(s) > 0$$

so that

$$f(t) = c, \qquad F(s) = \frac{c}{s}, \quad \mathrm{Re}(s) > 0 \qquad (5.5)$$

constitute an example of a Laplace transform pair.
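
The limiting process used in Example 5.1 can be observed numerically. The following is a minimal sketch, assuming a real value of s and a composite Simpson rule; the helper name `truncated_laplace` and the sample values of c, s and T are illustrative choices, not taken from the text. It evaluates the truncated integral for increasing T and compares it with c/s.

```python
import math


def truncated_laplace(f, s: float, T: float, n: int = 20000) -> float:
    """Composite Simpson estimate of the integral of exp(-s*t)*f(t) over [0, T]."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = T / n
    total = f(0.0) + math.exp(-s * T) * f(T)  # end-point contributions
    for i in range(1, n):
        t = i * h
        weight = 4 if i % 2 else 2
        total += weight * math.exp(-s * t) * f(t)
    return total * h / 3


c, s = 5.0, 2.0
exact = c / s  # result (5.5); applicable because Re(s) = 2 > 0

# The truncated integrals approach c/s as the upper limit T grows.
approximations = {T: truncated_laplace(lambda t: c, s, T) for T in (1.0, 5.0, 20.0)}
for T, value in approximations.items():
    print(f"T = {T:5.1f}: integral = {value:.8f}, error = {abs(value - exact):.2e}")
```

Repeating the experiment with a negative s shows the truncated integrals growing without bound, in line with the condition Re(s) > 0.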


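Example 5.2 Determine the Laplace transform of the ramp function

$$f(t) = t$$

Solution From the definition (5.2),

$$\mathcal{L}\{t\} = \int_0^{\infty} e^{-st}\,t\,dt = \lim_{T\to\infty} \int_0^{T} e^{-st}\,t\,dt$$

$$= \lim_{T\to\infty} \left[-\frac{t}{s}\,e^{-st} - \frac{e^{-st}}{s^2}\right]_0^T
= \frac{1}{s^2} - \lim_{T\to\infty} \frac{T e^{-sT}}{s} - \lim_{T\to\infty} \frac{e^{-sT}}{s^2}$$

Following the same procedure as in Example 5.1, limits exist provided that
Re(s) > 0, when

$$\lim_{T\to\infty} \frac{T e^{-sT}}{s} = \lim_{T\to\infty} \frac{e^{-sT}}{s^2} = 0$$

Thus, provided that Re(s) > 0,

$$\mathcal{L}\{t\} = \frac{1}{s^2}$$

giving us the Laplace transform pair

$$f(t) = t, \qquad F(s) = \frac{1}{s^2}, \quad \mathrm{Re}(s) > 0 \qquad (5.6)$$


Example 5.3 Determine the Laplace transform of the one-sided exponential function

$$f(t) = e^{kt}$$

Solution The definition (5.2) gives

$$\mathcal{L}\{e^{kt}\} = \int_0^{\infty} e^{-st} e^{kt}\,dt = \lim_{T\to\infty} \int_0^{T} e^{-(s-k)t}\,dt$$

$$= \lim_{T\to\infty} \frac{-1}{s-k}\left[e^{-(s-k)t}\right]_0^T = \frac{1}{s-k}\left(1 - \lim_{T\to\infty} e^{-(s-k)T}\right)$$

Writing $s = \sigma + j\omega$, where $\sigma$ and $\omega$ are real, we have

$$\lim_{T\to\infty} e^{-(s-k)T} = \lim_{T\to\infty} e^{-(\sigma-k)T}\,e^{-j\omega T}$$

If k is real, then, provided that $\sigma = \mathrm{Re}(s) > k$, the limit exists, and is zero. If k is
complex, say k = a + jb, then the limit will also exist, and be zero, provided that $\sigma > a$
(that is, Re(s) > Re(k)). Under these conditions, we then have

$$\mathcal{L}\{e^{kt}\} = \frac{1}{s-k}$$

giving us the Laplace transform pair

$$f(t) = e^{kt}, \qquad F(s) = \frac{1}{s-k}, \quad \mathrm{Re}(s) > \mathrm{Re}(k) \qquad (5.7)$$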


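Example 5.4 Determine the Laplace transforms of the sine and cosine functions

$$f(t) = \sin at, \qquad g(t) = \cos at$$

where a is a real constant.


Solution Since

$$e^{jat} = \cos at + j \sin at$$

we may write

$$f(t) = \sin at = \mathrm{Im}\,e^{jat}$$
$$g(t) = \cos at = \mathrm{Re}\,e^{jat}$$

Using this formulation, the required transforms may be obtained from the result

$$\mathcal{L}\{e^{kt}\} = \frac{1}{s-k}, \quad \mathrm{Re}(s) > \mathrm{Re}(k)$$

of Example 5.3.

Taking k = ja in this result gives

$$\mathcal{L}\{e^{jat}\} = \frac{1}{s - ja}, \quad \mathrm{Re}(s) > 0$$

or

$$\mathcal{L}\{e^{jat}\} = \frac{s + ja}{s^2 + a^2}, \quad \mathrm{Re}(s) > 0$$

Thus, equating real and imaginary parts and assuming s is real,

$$\mathcal{L}\{\sin at\} = \mathrm{Im}\,\mathcal{L}\{e^{jat}\} = \frac{a}{s^2 + a^2}$$

$$\mathcal{L}\{\cos at\} = \mathrm{Re}\,\mathcal{L}\{e^{jat}\} = \frac{s}{s^2 + a^2}$$

These results also hold when s is complex, giving us the Laplace transform pairs

$$\mathcal{L}\{\sin at\} = \frac{a}{s^2 + a^2}, \quad \mathrm{Re}(s) > 0 \qquad (5.8)$$

$$\mathcal{L}\{\cos at\} = \frac{s}{s^2 + a^2}, \quad \mathrm{Re}(s) > 0 \qquad (5.9)$$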
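
The remark that (5.8) and (5.9) continue to hold for complex s can be checked numerically. The sketch below assumes a truncated Simpson approximation of the defining integral; the helper `laplace_numeric`, the truncation point T = 40 and the sample values a = 3, s = 1 + 2j are illustrative assumptions. It compares the numerical integrals with the closed forms.

```python
import cmath
import math


def laplace_numeric(f, s: complex, T: float = 40.0, n: int = 40000) -> complex:
    """Simpson estimate of the Laplace integral of f, truncated at t = T (n even)."""
    h = T / n
    total = f(0.0) + cmath.exp(-s * T) * f(T)  # end-point contributions
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * cmath.exp(-s * t) * f(t)
    return total * h / 3


a = 3.0
s = 1.0 + 2.0j  # a complex test point with Re(s) > 0

sin_numeric = laplace_numeric(lambda t: math.sin(a * t), s)
cos_numeric = laplace_numeric(lambda t: math.cos(a * t), s)

sin_exact = a / (s**2 + a**2)  # pair (5.8)
cos_exact = s / (s**2 + a**2)  # pair (5.9)

print(abs(sin_numeric - sin_exact), abs(cos_numeric - cos_exact))
```

Because Re(s) = 1, the neglected tail beyond T = 40 is of order e^{-40}, so the discrepancy printed is dominated by the quadrature error.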
In MATLAB, using the Symbolic Toolbox, the Laplace transform of a function f(t)
is obtained by entering the commands

syms s t
laplace(f(t))

with the purpose of the first command being that of setting up s and t as symbolic
variables.

To search for a simpler form of the symbolic answer enter the command
simple(ans). Sometimes repeated use of this command may be necessary. (In recent
MATLAB releases the simple command has been removed, and simplify is used
instead.) To display the answer in a format that resembles typeset mathematics, use
is made of the pretty command. Use of such commands will be illustrated later in
some of the examples.

If the function f(t) includes a parameter then this must be declared as a symbolic
term at the outset. For example, the sequence of commands

syms s t a
laplace(sin(a*t))

gives, as required,

ans = a/(s^2 + a^2)

as the Laplace transform of sin(at).

Use of MAPLE is almost identical to the MATLAB Symbolic Math Toolbox
except for minor semantic differences. However, when using MAPLE the integral
transform package must be invoked using inttrans and the variables t and s must
be specified explicitly. For instance the commands

with(inttrans):
laplace(sin(a*t),t,s);

return the transform as

$$\frac{a}{s^2 + a^2}$$

5.2.3 Existence of the Laplace transform 


Clearly, from the definition (5.2), the Laplace transform of a function f(t) exists if and 
only if the improper integral in the definition converges for at least some values of s. 
The examples of Section 5.2.2 suggest that this relates to the boundedness of the
function, with the factor $e^{-st}$ in the transform integral acting like a convergence factor in
that the allowed values of Re(s) are those for which the integral converges. In order
to be able to state sufficient conditions on f(t) for the existence of $\mathcal{L}\{f(t)\}$, we first
introduce the definition of a function of exponential order.


Definition 5.1


A function f(t) is said to be of exponential order as $t \to \infty$ if there exists a real
number $\sigma$ and positive constants M and T such that

$$|f(t)| < M e^{\sigma t}$$

for all t > T.
What this definition tells us is that a function f(t) is of exponential order if it does not
grow faster than some exponential function of the form $M e^{\sigma t}$. Fortunately most functions
of practical significance satisfy this requirement, and are therefore of exponential order.
There are, however, functions that are not of exponential order, an example being $e^{t^2}$,
since this grows more rapidly than $M e^{\sigma t}$ as $t \to \infty$ whatever the values of M and $\sigma$.


Example 5.5 The function $f(t) = e^{3t}$ is of exponential order, with $\sigma \ge 3$.


Example 5.6 Show that the function $f(t) = t^3$ $(t \ge 0)$ is of exponential order.


Solution Since

$$e^{\alpha t} = 1 + \alpha t + \tfrac{1}{2}\alpha^2 t^2 + \tfrac{1}{6}\alpha^3 t^3 + \dots$$

it follows that for any $\alpha > 0$

$$t^3 < \frac{6}{\alpha^3}\,e^{\alpha t}$$

so that $t^3$ is of exponential order, with $\sigma > 0$.


It follows from Examples 5.5 and 5.6 that the choice of $\sigma$ in Definition 5.1 is not
unique for a particular function. For this reason, we define the greatest lower bound $\sigma_c$
of the set of possible values of $\sigma$ to be the abscissa of convergence of f(t). Thus, in the
case of the function $f(t) = e^{3t}$, $\sigma_c = 3$, while in the case of the function $f(t) = t^3$, $\sigma_c = 0$.
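
The two claims made above, that $t^3$ is bounded by $(6/\alpha^3)e^{\alpha t}$ for any $\alpha > 0$ and that $e^{t^2}$ eventually outgrows any $M e^{\sigma t}$, can be spot-checked numerically. The following is a sketch only: the function names, the sample grids and the values of $\alpha$, M and $\sigma$ are assumptions made for illustration, and a finite sample cannot replace the argument in the text.

```python
import math


def bound_holds(alpha: float, t_values) -> bool:
    """Check t**3 < (6 / alpha**3) * exp(alpha * t) at every sample point."""
    return all(t**3 < (6 / alpha**3) * math.exp(alpha * t) for t in t_values)


samples = [0.1 * k for k in range(0, 1001)]  # t from 0 to 100
cubic_ok = {alpha: bound_holds(alpha, samples) for alpha in (0.1, 1.0, 3.0)}


def first_escape(M: float, sigma: float, step: float = 0.01, t_max: float = 50.0):
    """First sampled t at which exp(t**2) exceeds M * exp(sigma * t), else None."""
    t = 0.0
    while t <= t_max:
        # Compare logarithms of both sides to avoid floating-point overflow.
        if t * t > math.log(M) + sigma * t:
            return t
        t += step
    return None


escape_time = first_escape(M=1e6, sigma=10.0)
print(cubic_ok, escape_time)
```

Increasing M or $\sigma$ only delays the escape time; it never prevents it, which is why $e^{t^2}$ is not of exponential order.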

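Returning to the definition of the Laplace transform given by (5.2), it follows that
if f(t) is a continuous function and is also of exponential order with abscissa of
convergence $\sigma_c$, so that

$$|f(t)| < M e^{\sigma_d t}, \quad \sigma_d > \sigma_c$$

then, taking T = 0 in Definition 5.1,

$$|F(s)| = \left|\int_0^{\infty} e^{-st} f(t)\,dt\right| \le \int_0^{\infty} |e^{-st}|\,|f(t)|\,dt$$

Writing $s = \sigma + j\omega$, where $\sigma$ and $\omega$ are real, since $|e^{-j\omega t}| = 1$, we have

$$|e^{-st}| = |e^{-\sigma t}|\,|e^{-j\omega t}| = |e^{-\sigma t}| = e^{-\sigma t}$$

so that

$$|F(s)| \le \int_0^{\infty} e^{-\sigma t}\,|f(t)|\,dt \le M \int_0^{\infty} e^{-\sigma t} e^{\sigma_d t}\,dt, \quad \sigma_d > \sigma_c$$

$$= M \int_0^{\infty} e^{-(\sigma - \sigma_d)t}\,dt$$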

This last integral is finite whenever $\sigma = \mathrm{Re}(s) > \sigma_d$. Since $\sigma_d$ can be chosen arbitrarily
such that $\sigma_d > \sigma_c$, we conclude that F(s) exists for $\sigma > \sigma_c$. Thus a continuous function
f(t) of exponential order, with abscissa of convergence $\sigma_c$, has a Laplace transform

$$\mathcal{L}\{f(t)\} = F(s), \quad \mathrm{Re}(s) > \sigma_c$$

where the region of convergence is as shown in Figure 5.4.

Figure 5.4 Region of convergence for $\mathcal{L}\{f(t)\}$ in the s plane; $\sigma_c$ is the abscissa of
convergence for f(t): (a) $\sigma_c > 0$; (b) $\sigma_c < 0$.

In fact, the requirement that f(t) be continuous is not essential, and may be relaxed
to f(t) being piecewise-continuous, as defined in Section 8.8.1 of Modern Engineering
Mathematics; that is, f(t) must have only a finite number of finite discontinuities, being
elsewhere continuous and bounded.

We conclude this section by stating a theorem that ensures the existence of a Laplace
transform.


Theorem 5.1 Existence of Laplace transform


If the causal function f(t) is piecewise-continuous on $[0, \infty]$ and is of exponential order,
with abscissa of convergence $\sigma_c$, then its Laplace transform exists, with region of
convergence $\mathrm{Re}(s) > \sigma_c$ in the s domain; that is,

$$\mathcal{L}\{f(t)\} = F(s) = \int_0^{\infty} e^{-st} f(t)\,dt, \quad \mathrm{Re}(s) > \sigma_c$$

end of theorem


The conditions of this theorem are sufficient for ensuring the existence of the Laplace
transform of a function. They do not, however, constitute necessary conditions for
the existence of such a transform, and it does not follow that if the conditions are
violated then a transform does not exist. In fact, the conditions are more restrictive than
necessary, since there exist functions with infinite discontinuities that possess Laplace
transforms.


5.2.4 Properties of the Laplace transform


In this section we consider some of the properties of the Laplace transform that will 
enable us to find further transform pairs { f(t), F(s)} without having to compute them 
directly using the definition. Further properties will be developed in later sections when 
the need arises. 
Property 5.1: The linearity property


A fundamental property of the Laplace transform is its linearity, which may be stated
as follows:


If f(t) and g(t) are functions having Laplace transforms and if $\alpha$ and $\beta$ are any
constants then

$$\mathcal{L}\{\alpha f(t) + \beta g(t)\} = \alpha\mathcal{L}\{f(t)\} + \beta\mathcal{L}\{g(t)\}$$

As a consequence of this property, we say that the Laplace transform operator is
a linear operator. A proof of the property follows readily from the definition (5.2),
since

$$\mathcal{L}\{\alpha f(t) + \beta g(t)\} = \int_0^{\infty} [\alpha f(t) + \beta g(t)]\,e^{-st}\,dt$$

$$= \int_0^{\infty} \alpha f(t)\,e^{-st}\,dt + \int_0^{\infty} \beta g(t)\,e^{-st}\,dt$$

$$= \alpha \int_0^{\infty} f(t)\,e^{-st}\,dt + \beta \int_0^{\infty} g(t)\,e^{-st}\,dt$$

$$= \alpha\mathcal{L}\{f(t)\} + \beta\mathcal{L}\{g(t)\}$$

Regarding the region of convergence, if f(t) and g(t) have abscissae of convergence $\sigma_f$
and $\sigma_g$ respectively, and $\sigma_1 > \sigma_f$, $\sigma_2 > \sigma_g$, then

$$|f(t)| < M_1 e^{\sigma_1 t}, \qquad |g(t)| < M_2 e^{\sigma_2 t}$$

It follows that

$$|\alpha f(t) + \beta g(t)| \le |\alpha|\,|f(t)| + |\beta|\,|g(t)| \le |\alpha| M_1 e^{\sigma_1 t} + |\beta| M_2 e^{\sigma_2 t}
\le (|\alpha| M_1 + |\beta| M_2)\,e^{\sigma t}$$

where $\sigma = \max(\sigma_1, \sigma_2)$, so that the abscissa of convergence of the linear sum
$\alpha f(t) + \beta g(t)$ is less than or equal to the maximum of those for f(t) and g(t).

This linearity property may clearly be extended to a linear combination of any finite
number of functions.


Example 5.7 Determine $\mathcal{L}\{3t + 2e^{3t}\}$.


Solution Using the results given in (5.6) and (5.7),

$$\mathcal{L}\{t\} = \frac{1}{s^2}, \quad \mathrm{Re}(s) > 0$$

$$\mathcal{L}\{e^{3t}\} = \frac{1}{s-3}, \quad \mathrm{Re}(s) > 3$$

so, by the linearity property,

$$\mathcal{L}\{3t + 2e^{3t}\} = 3\mathcal{L}\{t\} + 2\mathcal{L}\{e^{3t}\}$$

$$= \frac{3}{s^2} + \frac{2}{s-3}, \quad \mathrm{Re}(s) > \max\{0, 3\}$$

$$= \frac{3}{s^2} + \frac{2}{s-3}, \quad \mathrm{Re}(s) > 3$$

The answer can be checked using the commands

MATLAB:
syms s t
laplace(3*t + 2*exp(3*t));
pretty(ans)

MAPLE:
with(inttrans):
laplace(3*t + 2*exp(3*t),t,s);

which return

3/s^2 + 2/(s - 3)

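Example 5.8 Determine $\mathcal{L}\{5 - 3t + 4\sin 2t - 6e^{4t}\}$.


Solution Using the results given in (5.5)-(5.8),

$$\mathcal{L}\{5\} = \frac{5}{s}, \quad \mathrm{Re}(s) > 0 \qquad\qquad \mathcal{L}\{t\} = \frac{1}{s^2}, \quad \mathrm{Re}(s) > 0$$

$$\mathcal{L}\{\sin 2t\} = \frac{2}{s^2 + 4}, \quad \mathrm{Re}(s) > 0 \qquad\qquad \mathcal{L}\{e^{4t}\} = \frac{1}{s-4}, \quad \mathrm{Re}(s) > 4$$

so, by the linearity property,

$$\mathcal{L}\{5 - 3t + 4\sin 2t - 6e^{4t}\} = \mathcal{L}\{5\} - 3\mathcal{L}\{t\} + 4\mathcal{L}\{\sin 2t\} - 6\mathcal{L}\{e^{4t}\}$$

$$= \frac{5}{s} - \frac{3}{s^2} + \frac{8}{s^2 + 4} - \frac{6}{s-4}, \quad \mathrm{Re}(s) > \max\{0, 4\}$$

$$= \frac{5}{s} - \frac{3}{s^2} + \frac{8}{s^2 + 4} - \frac{6}{s-4}, \quad \mathrm{Re}(s) > 4$$

Again this answer can be checked using the commands

syms s t
laplace(5 - 3*t + 4*sin(2*t) - 6*exp(4*t))

in MATLAB, or the commands

with(inttrans):
laplace(5 - 3*t + 4*sin(2*t) - 6*exp(4*t),t,s);

in MAPLE.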
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Theorem 5.2 


Proof 


Example 5.9 


Solution 


The first shift property is another property that enables us to add more combinations 
to our repertoire of Laplace transform pairs. As with the linearity property, it will prove 
to be of considerable importance in our later discussions particularly when considering 
the inversion of Laplace transforms. 


Property 5.2: The first shift property 


The property is contained in the following theorem, commonly referred to as the first 
shift theorem or sometimes as the exponential modulation theorem. 


The first shift theorem 


Iff( is a function having Laplace transform F(s), with Re(s) > о, then the function 
e“f(t) also has a Laplace transform, given by 


JZe"f(t)) 2 F(s—a), Re(s)- o,- Re(a) 


A proof of the theorem follows directly from the definition ofthe Laplace transform, since 


оо оо 


SNA E dt = | fu e dt 


0 


4e = | 


0 


Then, since 


oo 


Li f(D} = F(s) =| frye" dt, Re(s) > о. 


0 


we see that the last integral above is in structure exactly the Laplace transform of f(t) 
itself, except that s — a takes the place of s, so that 


JZ(e"f(ft)) 2F(s—a) Re(s—-a)? o, 
ог 
Фе") = Е(ѕ – а), Re(s) > o, Re(a) 


end of theorem 


An alternative way of expressing the result of Theorem 5.2, which may be found 
more convenient in application, is 


$$\mathcal{L}\{e^{at}f(t)\} = [\mathcal{L}\{f(t)\}]_{s \to s-a} = [F(s)]_{s \to s-a}$$


In other words, the theorem says that the Laplace transform of $e^{at}$ times a function f(t)
is equal to the Laplace transform of f(t) itself, with s replaced by s - a.
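
The statement can also be checked numerically. The following is a minimal sketch, not part of the text's MATLAB/MAPLE material: the helper `laplace_numeric`, the truncation point `upper` and the step count are illustrative assumptions, and only real values of s are handled. It approximates the defining integral by Simpson's rule and compares $\mathcal{L}\{t e^{-2t}\}$ with $F(s + 2)$, where $F(s) = 1/s^2$.

```python
import math

def laplace_numeric(f, s, upper=60.0, steps=60000):
    """Approximate the integral of f(t)*exp(-s*t) over [0, upper] by Simpson's rule.

    Assumes real s and an integrand that is negligible beyond `upper`;
    `steps` must be even.
    """
    h = upper / steps
    total = f(0.0) + f(upper) * math.exp(-s * upper)
    for i in range(1, steps):
        t = i * h
        weight = 4 if i % 2 else 2          # Simpson weights 4, 2, 4, ...
        total += weight * f(t) * math.exp(-s * t)
    return total * h / 3

s = 1.5
# F(s) = L{t} = 1/s^2, so the first shift theorem predicts L{t e^{-2t}} = F(s + 2).
shifted = laplace_numeric(lambda t: t * math.exp(-2 * t), s)
predicted = 1 / (s + 2) ** 2
print(shifted, predicted)
```

The two printed values should agree to within the quadrature error, which illustrates the replacement of s by s - a with a = -2.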


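Determine $\mathcal{L}\{t e^{-2t}\}$.


From the result given in (5.6),

$$\mathcal{L}\{t\} = F(s) = \frac{1}{s^2}, \quad \mathrm{Re}(s) > 0$$

so, by the first shift theorem,

$$\mathcal{L}\{t e^{-2t}\} = F(s + 2) = [F(s)]_{s \to s+2}, \quad \mathrm{Re}(s) > 0 - 2$$

that is,

$$\mathcal{L}\{t e^{-2t}\} = \frac{1}{(s + 2)^2}, \quad \mathrm{Re}(s) > -2$$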


This is readily dealt with using MATLAB or MAPLE. The commands

    MATLAB                    MAPLE
    syms s t                  with(inttrans):
    laplace(t*exp(-2*t))      laplace(t*exp(-2*t),t,s);
    pretty(ans)

return the transform as

    1/(s + 2)^2


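Example 5.10 Determine $\mathcal{L}\{e^{-3t}\sin 2t\}$.


Solution From the result (5.8),

$$\mathcal{L}\{\sin 2t\} = F(s) = \frac{2}{s^2 + 4}, \quad \mathrm{Re}(s) > 0$$

so, by the first shift theorem,

$$\mathcal{L}\{e^{-3t}\sin 2t\} = F(s + 3) = [F(s)]_{s \to s+3}, \quad \mathrm{Re}(s) > 0 - 3$$

that is,

$$\mathcal{L}\{e^{-3t}\sin 2t\} = \frac{2}{(s + 3)^2 + 4} = \frac{2}{s^2 + 6s + 13}, \quad \mathrm{Re}(s) > -3$$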


(m) In MATLAB the commands: 


syms s t 
laplace(exp(-3*t)*sin(2*t)) 


return 
    ans = 2/((s + 3)^2 + 4)
Entering the further commands 


simple (ans); 
pretty (ans) 
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Theorem 5.3 


returns 
    2/(s^2 + 6*s + 13)


as an alternative form of the answer. Note that the last two commands could be 
replaced by the single command pretty (simple (ans)). 
In MAPLE the commands 


with(inttrans): 
laplace(exp(-3*t)*sin(2*t),t,s); 


return the answer

    2/((s + 3)^2 + 4)

There is no simple command in MAPLE.


The function $e^{-3t}\sin 2t$ in Example 5.10 is a member of a general class of functions
called damped sinusoids. These play an important role in the study of engineering
systems, particularly in the analysis of vibrations. For this reason, we add
the following two general members of the class to our standard library of Laplace
transform pairs:

$$\mathcal{L}\{e^{-kt}\sin at\} = \frac{a}{(s + k)^2 + a^2}, \quad \mathrm{Re}(s) > -k \qquad (5.10)$$

$$\mathcal{L}\{e^{-kt}\cos at\} = \frac{s + k}{(s + k)^2 + a^2}, \quad \mathrm{Re}(s) > -k \qquad (5.11)$$

where in both cases k and a are real constants.


Property 5.3: Derivative-of-transform property 


This property relates operations in the time domain to those in the transformed s 
domain, but initially we shall simply look upon it as a method of increasing our 
repertoire of Laplace transform pairs. The property is also sometimes referred to as the 
multiplication-by-t property. A statement of the property is contained in the following
theorem. 


Derivative of transform 


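If f(t) is a function having Laplace transform

$$F(s) = \mathcal{L}\{f(t)\}, \quad \mathrm{Re}(s) > \sigma_c$$

then the functions $t^n f(t)$ (n = 1, 2, ...) also have Laplace transforms, given by

$$\mathcal{L}\{t^n f(t)\} = (-1)^n \frac{d^n F(s)}{ds^n}, \quad \mathrm{Re}(s) > \sigma_c$$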
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Proof By definition, 


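$$\mathcal{L}\{f(t)\} = F(s) = \int_0^\infty e^{-st} f(t)\,dt$$

so that

$$\frac{d^n F(s)}{ds^n} = \frac{d^n}{ds^n} \int_0^\infty e^{-st} f(t)\,dt$$

Owing to the convergence properties of the improper integral involved, we can interchange
the operations of differentiation and integration and differentiate with respect to
s under the integral sign. Thus

$$\frac{d^n F(s)}{ds^n} = \int_0^\infty \frac{\partial^n}{\partial s^n}\left[e^{-st} f(t)\right] dt$$

which, on carrying out the repeated differentiation, gives

$$\frac{d^n F(s)}{ds^n} = (-1)^n \int_0^\infty e^{-st}\, t^n f(t)\,dt = (-1)^n \mathcal{L}\{t^n f(t)\}, \quad \mathrm{Re}(s) > \sigma_c$$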


the region of convergence remaining unchanged. 


end of theorem 


In other words, Theorem 5.3 says that differentiating the transform of a function 
with respect to s is equivalent to multiplying the function itself by —t. As with the pre- 
vious properties, we can now use this result to add to our list of Laplace transform pairs. 
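
As a brief illustration (an added sketch, not one of the numbered examples), taking n = 1 and f(t) = cos at, with the transform pair for cos at listed in Figure 5.5(a), the theorem would give:

```latex
\mathcal{L}\{t\cos at\}
  = -\frac{d}{ds}\left(\frac{s}{s^2+a^2}\right)
  = -\frac{(s^2+a^2) - 2s^2}{(s^2+a^2)^2}
  = \frac{s^2-a^2}{(s^2+a^2)^2},
  \qquad \mathrm{Re}(s) > 0
```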


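Example 5.11 Determine $\mathcal{L}\{t\sin 3t\}$.


Solution Using the result (5.8),

$$\mathcal{L}\{\sin 3t\} = F(s) = \frac{3}{s^2 + 9}, \quad \mathrm{Re}(s) > 0$$

so, by the derivative theorem,

$$\mathcal{L}\{t\sin 3t\} = -\frac{dF(s)}{ds} = \frac{6s}{(s^2 + 9)^2}, \quad \mathrm{Re}(s) > 0$$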


(B In MATLAB the commands 


syms s t 
laplace(t*sin(3*t)) 


return 
    ans = 1/(s^2 + 9)*sin(2*atan(3/s))
Applying the further command 


simple (ans) 
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returns 
    ans = 6/(s^2 + 9)/s/(1 + 9/s^2)
Repeating the simple command 
simple (ans) 
returns the answer in the more desirable form 
    ans = 6*s/(s^2 + 9)^2
In MAPLE the commands 


    with(inttrans):
    laplace(t*sin(3*t),t,s);

return the transform as

    sin(2*arctan(3/s))/(s^2 + 9)

and there appears to be no command to simplify this.


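Example 5.12 Determine $\mathcal{L}\{t^2 e^t\}$.

Solution From the result (5.7),

$$\mathcal{L}\{e^t\} = F(s) = \frac{1}{s - 1}, \quad \mathrm{Re}(s) > 1$$

so, by the derivative theorem,

$$\mathcal{L}\{t^2 e^t\} = (-1)^2 \frac{d^2 F(s)}{ds^2} = \frac{d^2}{ds^2}\left(\frac{1}{s - 1}\right) = (-1)\frac{d}{ds}\left(\frac{1}{(s - 1)^2}\right) = \frac{2}{(s - 1)^3}, \quad \mathrm{Re}(s) > 1$$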


Note that the result is easier to deduce using the first shift theorem. 


Using MATLAB or MAPLE confirm that the answer may be checked using the
following commands:

    MATLAB                    MAPLE
    syms s t                  with(inttrans):
    laplace(t^2*exp(t))       laplace(t^2*exp(t),t,s);


Example 5.13 


Solution 


5.2.5 


Figure 5.5 

(a) Table of Laplace 
transform pairs; 

(b) some properties of 
the Laplace transform.

Determine $\mathcal{L}\{t^n\}$, where n is a positive integer.


Using the result (5.5),

$$\mathcal{L}\{1\} = \frac{1}{s}, \quad \mathrm{Re}(s) > 0$$

so, by the derivative theorem,

$$\mathcal{L}\{t^n\} = (-1)^n \frac{d^n}{ds^n}\left(\frac{1}{s}\right) = \frac{n!}{s^{n+1}}, \quad \mathrm{Re}(s) > 0$$


Table of Laplace transforms 


It is appropriate at this stage to draw together the results proved to date for easy access. 
This is done in the form of two short tables. Figure 5.5(a) lists some Laplace transform
pairs and Figure 5.5(b) lists the properties already considered. 


























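(a)

| f(t) | $\mathcal{L}\{f(t)\} = F(s)$ | Region of convergence |
| c, c a constant | $c/s$ | Re(s) > 0 |
| t | $1/s^2$ | Re(s) > 0 |
| $t^n$, n a positive integer | $n!/s^{n+1}$ | Re(s) > 0 |
| $e^{kt}$, k a constant | $1/(s - k)$ | Re(s) > Re(k) |
| sin at, a a real constant | $a/(s^2 + a^2)$ | Re(s) > 0 |
| cos at, a a real constant | $s/(s^2 + a^2)$ | Re(s) > 0 |
| $e^{-kt}\sin at$, k and a real constants | $a/[(s + k)^2 + a^2]$ | Re(s) > -k |
| $e^{-kt}\cos at$, k and a real constants | $(s + k)/[(s + k)^2 + a^2]$ | Re(s) > -k |

(b)

$\mathcal{L}\{f(t)\} = F(s)$, $\mathrm{Re}(s) > \sigma_1$, and $\mathcal{L}\{g(t)\} = G(s)$, $\mathrm{Re}(s) > \sigma_2$

Linearity: $\mathcal{L}\{\alpha f(t) + \beta g(t)\} = \alpha F(s) + \beta G(s)$, $\mathrm{Re}(s) > \max(\sigma_1, \sigma_2)$

First shift theorem: $\mathcal{L}\{e^{at} f(t)\} = F(s - a)$, $\mathrm{Re}(s) > \sigma_1 + \mathrm{Re}(a)$

Derivative of transform: $\mathcal{L}\{t^n f(t)\} = (-1)^n \dfrac{d^n F(s)}{ds^n}$ (n = 1, 2, ...), $\mathrm{Re}(s) > \sigma_1$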
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5.2.6 Exercises 


1 Use the definition of the Laplace transform 3 
to obtain the transforms of f(4) when f(A) is 


Using the results shown in Figure 5.5, obtain the 
Laplace transforms of the following functions, 


given by stating the region of convergence: 
(a) cosh2t (Ы) P? (с) 3-t — (d) te* (а) 5-31 (Б) 77? —2 sin3t 
stating the region of convergence in each case. (с) 3 – 21+ 4с0521 (d) cosh 3¢ 

2  Whatare the abscissae of convergence for the Rd D M 
following functions? (g) 4te* (Һ) 2е7°* ѕіп 27 
(а) е? (b) e? (i) te* (G) 6-3 +4t-2 
(c) sin2t (d) sinh 3t (К) 2с05 37+ 5 510 31 (1) tcos2t 
(е) cosh 2t (Е) г“ (т) #2 ѕіп 3/ (п) £? 3 соѕ47 
(g) е +22 (h) 3 соѕ 27-Р (о) 12е + е соѕ27+ 3 


(1) 3е2— 262+ 5іп27 


(j) sinh 3¢ + sin 3¢ Check your answers using MATLAB or MAPLE. 


5.2.7 The inverse transform 


Figure 5.6 
The Laplace transform 
and its inverse. 


The symbol $\mathcal{L}^{-1}\{F(s)\}$ denotes a causal function f(t) whose Laplace transform is F(s);
that is,

$$\text{if } \mathcal{L}\{f(t)\} = F(s) \text{ then } f(t) = \mathcal{L}^{-1}\{F(s)\}$$

This correspondence between the functions F(s) and f(t) is called the inverse
Laplace transformation, f(t) being the inverse transform of F(s), and $\mathcal{L}^{-1}$ being
referred to as the inverse Laplace transform operator. These relationships are depicted
in Figure 5.6.

As was pointed out in observation (c) of Section 5.2.1, the Laplace transform F(s)
only determines the behaviour of f(t) for $t \ge 0$. Thus $\mathcal{L}^{-1}\{F(s)\} = f(t)$ only for $t \ge 0$.
When writing $\mathcal{L}^{-1}\{F(s)\} = f(t)$, it is assumed that $t \ge 0$ so, strictly speaking, we should
write

$$\mathcal{L}^{-1}\{F(s)\} = f(t)H(t) \qquad (5.12)$$

Example 5.14 


Example 5.15 


5.2.8 


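Since

$$\mathcal{L}\{e^{at}\} = \frac{1}{s - a}$$

it follows that

$$\mathcal{L}^{-1}\left\{\frac{1}{s - a}\right\} = e^{at}$$

Since

$$\mathcal{L}\{\sin \omega t\} = \frac{\omega}{s^2 + \omega^2}$$

it follows that

$$\mathcal{L}^{-1}\left\{\frac{\omega}{s^2 + \omega^2}\right\} = \sin \omega t$$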





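The linearity property for the Laplace transform (Property 5.1) states that if $\alpha$ and $\beta$
are any constants then

$$\mathcal{L}\{\alpha f(t) + \beta g(t)\} = \alpha\mathcal{L}\{f(t)\} + \beta\mathcal{L}\{g(t)\} = \alpha F(s) + \beta G(s)$$

It then follows from the above definition that

$$\mathcal{L}^{-1}\{\alpha F(s) + \beta G(s)\} = \alpha f(t) + \beta g(t) = \alpha\mathcal{L}^{-1}\{F(s)\} + \beta\mathcal{L}^{-1}\{G(s)\}$$

so that the inverse Laplace transform operator $\mathcal{L}^{-1}$ is also a linear operator.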


Evaluation of inverse transforms 


The most obvious way of finding the inverse transform of the function F(s) is to make 
use of a table of transforms such as that given in Figure 5.5. Sometimes it is possible 
to write down the inverse transform directly from the table, but more often than not 
it is first necessary to carry out some algebraic manipulation on F(s). In particular, we 
frequently need to determine the inverse transform of a rational function of the form 
p(s)/q(s), where p(s) and q(s) are polynomials in s. In such cases the procedure is first
to resolve the function into partial fractions and then to use the table of transforms. 
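
For the special case of distinct simple poles, the partial-fraction step can be sketched in a few lines. This is an illustrative sketch only: the function names `simple_pole_residues` and `inverse_at` are assumptions introduced here, the cover-up formula used is one standard technique rather than the text's own procedure, and repeated or complex poles are not handled. The data are those of the rational function $1/[(s + 3)(s - 2)]$.

```python
import math
from fractions import Fraction

def simple_pole_residues(numerator, poles):
    """Coefficients A_k in p(s)/prod(s - p_k) = sum A_k/(s - p_k).

    numerator: coefficients of p(s), highest power first, deg p < len(poles).
    poles: distinct real poles p_k (cover-up rule; repeated poles not handled).
    """
    def p(s):
        value = Fraction(0)
        for c in numerator:
            value = value * s + c            # Horner's rule
        return value

    residues = []
    for k, pk in enumerate(poles):
        denom = Fraction(1)
        for j, pj in enumerate(poles):
            if j != k:
                denom *= (pk - pj)           # product over the other poles
        residues.append(p(pk) / denom)
    return residues

# 1/((s + 3)(s - 2)) has simple poles at s = -3 and s = 2.
poles = [Fraction(-3), Fraction(2)]
residues = simple_pole_residues([1], poles)   # [-1/5, 1/5]

def inverse_at(t):
    """Evaluate sum A_k exp(p_k t), using the table entry for 1/(s - k)."""
    return sum(float(A) * math.exp(float(pk) * t) for A, pk in zip(residues, poles))

print(residues, inverse_at(1.0))
```

Each residue multiplies the table entry $e^{p_k t}$, so the inverse transform follows by linearity.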


Using MATLAB Symbolic Math Toolbox the commands 


syms s t
ilaplace(F(s)) 


return the inverse transform of F(s). The corresponding MAPLE commands are 


with(inttrans): 
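invlaplace(F(s),s,t);

Example 5.16 Find

$$\mathcal{L}^{-1}\left\{\frac{1}{(s + 3)(s - 2)}\right\}$$


Solution First 1/(s + 3)(s - 2) is resolved into partial fractions, giving

$$\frac{1}{(s + 3)(s - 2)} = \frac{-\frac{1}{5}}{s + 3} + \frac{\frac{1}{5}}{s - 2}$$

Then, using the result $\mathcal{L}^{-1}\{1/(s + a)\} = e^{-at}$ together with the linearity property, we
have

$$\mathcal{L}^{-1}\left\{\frac{1}{(s + 3)(s - 2)}\right\} = -\tfrac{1}{5}\mathcal{L}^{-1}\left\{\frac{1}{s + 3}\right\} + \tfrac{1}{5}\mathcal{L}^{-1}\left\{\frac{1}{s - 2}\right\} = -\tfrac{1}{5}e^{-3t} + \tfrac{1}{5}e^{2t}$$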


Using MATLAB or MAPLE the commands

    MATLAB                                 MAPLE
    syms s t                               with(inttrans):
    ilaplace(1/((s + 3)*(s - 2)));         invlaplace(1/((s + 3)*(s - 2)),s,t);
    pretty(ans)

return the answers

    -1/5 exp(-3t) + 1/5 exp(2t)            $-\tfrac{1}{5}e^{-3t} + \tfrac{1}{5}e^{2t}$


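Example 5.17 Find

$$\mathcal{L}^{-1}\left\{\frac{s + 1}{s^2(s^2 + 9)}\right\}$$


Solution Resolving $(s + 1)/s^2(s^2 + 9)$ into partial fractions gives

$$\frac{s + 1}{s^2(s^2 + 9)} = \frac{\frac{1}{9}}{s} + \frac{\frac{1}{9}}{s^2} - \frac{1}{9}\,\frac{s + 1}{s^2 + 9} = \frac{\frac{1}{9}}{s} + \frac{\frac{1}{9}}{s^2} - \frac{1}{9}\,\frac{s}{s^2 + 3^2} - \frac{1}{27}\,\frac{3}{s^2 + 3^2}$$

Using the results in Figure 5.5, together with the linearity property, we have

$$\mathcal{L}^{-1}\left\{\frac{s + 1}{s^2(s^2 + 9)}\right\} = \tfrac{1}{9} + \tfrac{1}{9}t - \tfrac{1}{9}\cos 3t - \tfrac{1}{27}\sin 3t$$

Using MATLAB or MAPLE check that the answer can be verified using the
following commands:

    MATLAB                                      MAPLE
    syms s t                                    with(inttrans):
    ilaplace((s + 1)/(s^2*(s^2 + 9)));          invlaplace((s + 1)/(s^2*(s^2 + 9)),s,t);
    pretty(ans)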


5.2.9 Inversion using the first shift theorem 


In Theorem 5.2 we saw that if F(s) is the Laplace transform of f(t) then, for a scalar a,
F(s - a) is the Laplace transform of $e^{at}f(t)$. This theorem normally causes little difficulty
when used to obtain the Laplace transforms of functions, but it does frequently
lead to problems when used to obtain inverse transforms. Expressed in the inverse form,
the theorem becomes

$$\mathcal{L}^{-1}\{F(s - a)\} = e^{at} f(t)$$

The notation

$$\mathcal{L}^{-1}\{[F(s)]_{s \to s-a}\} = e^{at}[f(t)]$$

where $F(s) = \mathcal{L}\{f(t)\}$ and $[F(s)]_{s \to s-a}$ denotes that s in F(s) is replaced by s - a, may
make the relation clearer.


Example 5.18 Find

$$\mathcal{L}^{-1}\left\{\frac{1}{(s + 2)^2}\right\}$$


Solution

$$\frac{1}{(s + 2)^2} = \left[\frac{1}{s^2}\right]_{s \to s+2}$$

and, since $1/s^2 = \mathcal{L}\{t\}$, the shift theorem gives

$$\mathcal{L}^{-1}\left\{\frac{1}{(s + 2)^2}\right\} = t e^{-2t}$$

Check the answer using MATLAB or MAPLE.

Example 5.19 Find

$$\mathcal{L}^{-1}\left\{\frac{2}{s^2 + 6s + 13}\right\}$$


Solution

$$\frac{2}{s^2 + 6s + 13} = \frac{2}{(s + 3)^2 + 4} = \left[\frac{2}{s^2 + 2^2}\right]_{s \to s+3}$$

and, since $2/(s^2 + 2^2) = \mathcal{L}\{\sin 2t\}$, the shift theorem gives

$$\mathcal{L}^{-1}\left\{\frac{2}{s^2 + 6s + 13}\right\} = e^{-3t}\sin 2t$$

The MATLAB commands

    syms s t
    ilaplace(2/(s^2 + 6*s + 13));
    pretty(simple(ans))

return

    ans = -1/2i(exp((-3 + 2i)t) - exp((-3 - 2i)t))

The MAPLE commands

    with(inttrans):
    invlaplace(2/(s^2 + 6*s + 13),s,t);
    simplify(%);

return the same answer.
To obtain the same format as provided in the solution further manipulation is
required as follows:

$$\tfrac{1}{2}\mathrm{i}\left[-e^{-3t}e^{2\mathrm{i}t} + e^{-3t}e^{-2\mathrm{i}t}\right] = e^{-3t}\left[\frac{e^{2\mathrm{i}t} - e^{-2\mathrm{i}t}}{2\mathrm{i}}\right] = e^{-3t}\sin 2t$$


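Example 5.20 Find

$$\mathcal{L}^{-1}\left\{\frac{s + 7}{s^2 + 2s + 5}\right\}$$


Solution

$$\frac{s + 7}{s^2 + 2s + 5} = \frac{s + 7}{(s + 1)^2 + 4} = \frac{s + 1}{(s + 1)^2 + 4} + 3\,\frac{2}{(s + 1)^2 + 4} = \left[\frac{s}{s^2 + 2^2}\right]_{s \to s+1} + 3\left[\frac{2}{s^2 + 2^2}\right]_{s \to s+1}$$

Since $s/(s^2 + 2^2) = \mathcal{L}\{\cos 2t\}$ and $2/(s^2 + 2^2) = \mathcal{L}\{\sin 2t\}$, the shift theorem gives

$$\mathcal{L}^{-1}\left\{\frac{s + 7}{s^2 + 2s + 5}\right\} = e^{-t}\cos 2t + 3e^{-t}\sin 2t$$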


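Example 5.21 Find

$$\mathcal{L}^{-1}\left\{\frac{1}{(s + 1)^2(s^2 + 4)}\right\}$$


Solution Resolving $1/(s + 1)^2(s^2 + 4)$ into partial fractions gives

$$\frac{1}{(s + 1)^2(s^2 + 4)} = \frac{\frac{2}{25}}{s + 1} + \frac{\frac{1}{5}}{(s + 1)^2} - \frac{1}{25}\,\frac{2s + 3}{s^2 + 4}$$

$$= \frac{\frac{2}{25}}{s + 1} + \frac{1}{5}\left[\frac{1}{s^2}\right]_{s \to s+1} - \frac{2}{25}\,\frac{s}{s^2 + 2^2} - \frac{3}{50}\,\frac{2}{s^2 + 2^2}$$

Since $1/s^2 = \mathcal{L}\{t\}$, the shift theorem, together with the results in Figure 5.5, gives

$$\mathcal{L}^{-1}\left\{\frac{1}{(s + 1)^2(s^2 + 4)}\right\} = \tfrac{2}{25}e^{-t} + \tfrac{1}{5}t e^{-t} - \tfrac{2}{25}\cos 2t - \tfrac{3}{50}\sin 2t$$

Check the answers to Examples 5.20 and 5.21 using MATLAB or MAPLE.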


5.2.10 Exercise 


Check your answers using MATLAB or MAPLE. 


Find Z^ (F(s)) when F(s) is given by 





I 5 +5 
@) 33) 7) ( Gc *D6-3 
E = 1 (d) 2 +6 
5(5= +3) s +4 
) — (f) its 
s'(s +16) 5 +45 +5 
(g) — h) — e 
s (s +45 +8) (s-1)(s +1) 
(i) - 5+7 
5 +25 +5 


I 352- 75 +5 55-7 
О) (1000-20063) O GHE 
(1) —_—— (m) uL. 

($= 1)(s +25 +2) 5 +25 +5 
s-1 3s 
Оа У С) 


36 
) 2 2 
s(s +1)(5 +9) 
252 +45 +9 
(= +2)(5 +35 +3) 


1 
(5 +1)(5 +2)(5° +25 +10) 
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Solution of differential equations 


5.3.1


We first consider the Laplace transforms of derivatives and integrals, and then apply 
these to the solution of differential equations. 


Transforms of derivatives 


If we are to use Laplace transform methods to solve differential equations, we need
to find convenient expressions for the Laplace transforms of derivatives such as df/dt,
$d^2f/dt^2$ or, in general, $d^nf/dt^n$. By definition,

$$\mathcal{L}\left\{\frac{df}{dt}\right\} = \int_0^\infty e^{-st}\,\frac{df}{dt}\,dt$$

Integrating by parts, we have

$$\mathcal{L}\left\{\frac{df}{dt}\right\} = \left[e^{-st} f(t)\right]_0^\infty + s\int_0^\infty e^{-st} f(t)\,dt = -f(0) + sF(s)$$

that is,

$$\mathcal{L}\left\{\frac{df}{dt}\right\} = sF(s) - f(0) \qquad (5.13)$$

In taking the Laplace transform of a derivative we have assumed that f(t) is continuous
at t = 0, so that $f(0^-) = f(0) = f(0^+)$. In Section 5.5.8, when considering the impulse
function, $f(0^-) \ne f(0^+)$ and we have to revert to a more generalized calculus to resolve
the problem.

The advantage of using the Laplace transform when dealing with differential equations 
can readily be seen, since it enables us to replace the operation of differentiation in the 
time domain by a simple algebraic operation in the s domain. 
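
As a quick consistency check on (5.13) (an added illustration, assuming the table entry for $e^{at}$ with $\mathrm{Re}(s) > \mathrm{Re}(a)$), take $f(t) = e^{at}$, so that $f(0) = 1$ and $df/dt = a e^{at}$. Both routes give the same transform:

```latex
\mathcal{L}\left\{\frac{df}{dt}\right\} = \mathcal{L}\{a e^{at}\} = \frac{a}{s-a},
\qquad
sF(s) - f(0) = \frac{s}{s-a} - 1 = \frac{a}{s-a}
```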

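Note that to deduce the result (5.13), we have assumed that f(t) is continuous, with
a piecewise-continuous derivative df/dt, for $t \ge 0$ and that it is also of exponential order
as $t \to \infty$.

Likewise, if both f(t) and df/dt are continuous on $t \ge 0$ and are of exponential order
as $t \to \infty$, and $d^2f/dt^2$ is piecewise-continuous for $t \ge 0$, then

$$\mathcal{L}\left\{\frac{d^2f}{dt^2}\right\} = \int_0^\infty e^{-st}\,\frac{d^2f}{dt^2}\,dt = \left[e^{-st}\,\frac{df}{dt}\right]_0^\infty + s\int_0^\infty e^{-st}\,\frac{df}{dt}\,dt = -\left[\frac{df}{dt}\right]_{t=0} + s\mathcal{L}\left\{\frac{df}{dt}\right\}$$

which, on using (5.13), gives

$$\mathcal{L}\left\{\frac{d^2f}{dt^2}\right\} = -\left[\frac{df}{dt}\right]_{t=0} + s[sF(s) - f(0)]$$

leading to the result

$$\mathcal{L}\left\{\frac{d^2f}{dt^2}\right\} = s^2F(s) - sf(0) - \left[\frac{df}{dt}\right]_{t=0} = s^2F(s) - sf(0) - f^{(1)}(0) \qquad (5.14)$$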
t t=0 


5.3.2

Clearly, provided that f(t) and its derivatives satisfy the required conditions, this procedure
may be extended to obtain the Laplace transform of $f^{(n)}(t) = d^nf/dt^n$ in the form

$$\mathcal{L}\{f^{(n)}(t)\} = s^nF(s) - s^{n-1}f(0) - s^{n-2}f^{(1)}(0) - \ldots - f^{(n-1)}(0)$$

$$= s^nF(s) - \sum_{i=1}^{n} s^{n-i} f^{(i-1)}(0) \qquad (5.15)$$

a result that may be readily proved by induction.
Again it is noted that in determining the Laplace transform of $f^{(n)}(t)$ we have
assumed that $f^{(n-1)}(t)$ is continuous.

Transforms of integrals 


In some applications the behaviour of a system may be represented by an integro- 
differential equation, which is an equation containing both derivatives and integrals 
of the unknown variable. For example, the current i in a series electrical circuit con- 
sisting of a resistance R, an inductance L and capacitance C, and subject to an applied 
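voltage E, is given by

$$L\frac{di}{dt} + Ri + \frac{1}{C}\int_0^t i(\tau)\,d\tau = E$$

To solve such equations directly, it is convenient to be able to obtain the Laplace
transform of integrals such as $\int_0^t f(\tau)\,d\tau$.
Writing

$$g(t) = \int_0^t f(\tau)\,d\tau$$

we have

$$\frac{dg}{dt} = f(t), \quad g(0) = 0$$

Taking Laplace transforms,

$$\mathcal{L}\left\{\frac{dg}{dt}\right\} = \mathcal{L}\{f(t)\}$$

which, on using (5.13), gives

$$sG(s) = F(s)$$

or

$$\mathcal{L}\{g(t)\} = G(s) = \frac{1}{s}F(s) = \frac{1}{s}\mathcal{L}\{f(t)\}$$

leading to the result

$$\mathcal{L}\left\{\int_0^t f(\tau)\,d\tau\right\} = \frac{1}{s}\mathcal{L}\{f(t)\} = \frac{1}{s}F(s) \qquad (5.16)$$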
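
A short check of (5.16) against the table of Figure 5.5 (an added illustration, with $\omega$ a real nonzero constant): for $f(t) = \cos\omega t$ the integral can be evaluated directly, and both sides agree.

```latex
\int_0^t \cos\omega\tau\,d\tau = \frac{\sin\omega t}{\omega}
\;\Rightarrow\;
\mathcal{L}\left\{\frac{\sin\omega t}{\omega}\right\} = \frac{1}{s^2+\omega^2},
\qquad
\frac{1}{s}\,\mathcal{L}\{\cos\omega t\} = \frac{1}{s}\cdot\frac{s}{s^2+\omega^2} = \frac{1}{s^2+\omega^2}
```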
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Example 5.22 


5.3.3 


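Obtain

$$\mathcal{L}\left\{\int_0^t (\tau^3 + \sin 2\tau)\,d\tau\right\}$$


In this case $f(t) = t^3 + \sin 2t$, giving

$$F(s) = \mathcal{L}\{f(t)\} = \mathcal{L}\{t^3\} + \mathcal{L}\{\sin 2t\} = \frac{6}{s^4} + \frac{2}{s^2 + 4}$$

so, by (5.16),

$$\mathcal{L}\left\{\int_0^t (\tau^3 + \sin 2\tau)\,d\tau\right\} = \frac{1}{s}F(s) = \frac{6}{s^5} + \frac{2}{s(s^2 + 4)}$$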

Ordinary differential equations 


Having obtained expressions for the Laplace transforms of derivatives, we are now in 
a position to use Laplace transform methods to solve ordinary linear differential equations 
with constant coefficients. To illustrate this, consider the general second-order linear 
differential equation 

$$a\frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = u(t) \quad (t \ge 0) \qquad (5.17)$$

subject to the initial conditions $x(0) = x_0$, $\dot{x}(0) = v_0$ where as usual a dot denotes
differentiation with respect to time, t. Such a differential equation may model the dynamics
of some system for which the variable x(t) determines the response of the system to the
forcing or excitation term u(t). The terms system input and system output are also
frequently used for u(t) and x(t) respectively. Since the differential equation is linear
and has constant coefficients, a system characterized by such a model is said to be a
linear time-invariant system.
Taking Laplace transforms of each term in (5.17) gives

$$a\mathcal{L}\left\{\frac{d^2x}{dt^2}\right\} + b\mathcal{L}\left\{\frac{dx}{dt}\right\} + c\mathcal{L}\{x\} = \mathcal{L}\{u(t)\}$$

which on using (5.13) and (5.14) leads to

$$a[s^2X(s) - sx(0) - \dot{x}(0)] + b[sX(s) - x(0)] + cX(s) = U(s)$$

Rearranging, and incorporating the given initial conditions, gives

$$(as^2 + bs + c)X(s) = U(s) + (as + b)x_0 + av_0$$

so that

$$X(s) = \frac{U(s) + (as + b)x_0 + av_0}{as^2 + bs + c} \qquad (5.18)$$


Example 5.23 


Solution 
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Equation (5.18) determines the Laplace transform X(s) of the response, from which, by 
taking the inverse transform, the desired time response x(t) may be obtained.


Before considering specific examples, there are a few observations worth noting at
this stage.


(a) As we have already noted in Section 5.3.1, a distinct advantage of using the
Laplace transform is that it enables us to replace the operation of differentiation
by an algebraic operation. Consequently, by taking the Laplace transform of each
term in a differential equation, it is converted into an algebraic equation in the
variable s. This may then be rearranged using algebraic rules to obtain an expression
for the Laplace transform of the response; the desired time response is then
obtained by taking the inverse transform.


(b) The Laplace transform method yields the complete solution to the linear differential
equation, with the initial conditions automatically included. This contrasts
with the classical approach, in which the general solution consists of two components,
the complementary function and the particular integral, with the initial
conditions determining the undetermined constants associated with the complementary
function. When the solution is expressed in the general form (5.18), upon
inversion the term involving U(s) leads to a particular integral while that involving
$x_0$ and $v_0$ gives a complementary function. A useful side issue is that an
explicit solution for the transient is obtained that reflects the initial conditions.


(c) The Laplace transform method is ideally suited for solving initial-value problems;
that is, linear differential equations in which all the initial conditions
$x(0)$, $\dot{x}(0)$, and so on, at time t = 0 are specified. The method is less attractive for
boundary-value problems, when the conditions on x(t) and its derivatives are not
all specified at t = 0, but some are specified at other values of the independent
variable. It is still possible, however, to use the Laplace transform method by
assigning arbitrary constants to one or more of the initial conditions and then
determining their values using the given boundary conditions.


(d) It should be noted that the denominator of the right-hand side of (5.18) is the
left-hand side of (5.17) with the operator d/dt replaced by s. The denominator equated
to zero also corresponds to the auxiliary equation or characteristic equation used
in the classical approach. Given a specific initial-value problem, the process of
obtaining a solution using Laplace transform methods is fairly straightforward,
and is illustrated by Example 5.23.


Solve the differential equation

$$\frac{d^2x}{dt^2} + 5\frac{dx}{dt} + 6x = 2e^{-t} \quad (t \ge 0)$$

subject to the initial conditions x = 1 and dx/dt = 0 at t = 0.


Taking Laplace transforms

$$\mathcal{L}\left\{\frac{d^2x}{dt^2}\right\} + 5\mathcal{L}\left\{\frac{dx}{dt}\right\} + 6\mathcal{L}\{x\} = 2\mathcal{L}\{e^{-t}\}$$

leads to the transformed equation

$$[s^2X(s) - sx(0) - \dot{x}(0)] + 5[sX(s) - x(0)] + 6X(s) = \frac{2}{s + 1}$$

which on rearrangement gives

$$(s^2 + 5s + 6)X(s) = \frac{2}{s + 1} + (s + 5)x(0) + \dot{x}(0)$$

Incorporating the given initial conditions $x(0) = 1$ and $\dot{x}(0) = 0$ leads to

$$(s^2 + 5s + 6)X(s) = \frac{2}{s + 1} + s + 5$$

That is,

$$X(s) = \frac{2}{(s + 1)(s + 2)(s + 3)} + \frac{s + 5}{(s + 3)(s + 2)}$$

Resolving the rational terms into partial fractions gives

$$X(s) = \frac{1}{s + 1} - \frac{2}{s + 2} + \frac{1}{s + 3} + \frac{3}{s + 2} - \frac{2}{s + 3} = \frac{1}{s + 1} + \frac{1}{s + 2} - \frac{1}{s + 3}$$

Taking inverse transforms gives the desired solution

$$x(t) = e^{-t} + e^{-2t} - e^{-3t} \quad (t \ge 0)$$
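
The closed-form answer can be cross-checked by integrating the differential equation numerically. The sketch below is an independent check under stated assumptions, not the text's method: it uses a hand-written classical fourth-order Runge-Kutta scheme, and the names `rhs`, `rk4` and `closed_form`, together with the step count, are illustrative choices.

```python
import math

def rhs(t, x, v):
    """x'' + 5x' + 6x = 2e^{-t} written as a first-order system in (x, v = x')."""
    return v, 2 * math.exp(-t) - 5 * v - 6 * x

def rk4(t_end, steps=2000):
    """Integrate from t = 0 with x(0) = 1, x'(0) = 0 and return x(t_end)."""
    t, x, v = 0.0, 1.0, 0.0
    h = t_end / steps
    for _ in range(steps):
        k1x, k1v = rhs(t, x, v)
        k2x, k2v = rhs(t + h / 2, x + h / 2 * k1x, v + h / 2 * k1v)
        k3x, k3v = rhs(t + h / 2, x + h / 2 * k2x, v + h / 2 * k2v)
        k4x, k4v = rhs(t + h, x + h * k3x, v + h * k3v)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += h
    return x

def closed_form(t):
    """Solution obtained by the Laplace transform method."""
    return math.exp(-t) + math.exp(-2 * t) - math.exp(-3 * t)

print(rk4(2.0), closed_form(2.0))
```

Agreement of the two printed values to many decimal places supports both the partial-fraction step and the inversion.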


In principle the procedure adopted in Example 5.23 for solving a second-order linear
differential equation with constant coefficients is readily carried over to higher-order
differential equations. A general nth-order linear differential equation may be written as

$$a_n\frac{d^nx}{dt^n} + a_{n-1}\frac{d^{n-1}x}{dt^{n-1}} + \ldots + a_0x = u(t) \quad (t \ge 0) \qquad (5.19)$$

where $a_n, a_{n-1}, \ldots, a_0$ are constants, with $a_n \ne 0$. This may be written in the more
concise form

$$q(D)x(t) = u(t) \qquad (5.20)$$

where D denotes the operator d/dt and q(D) is the polynomial

$$q(D) = \sum_{r=0}^{n} a_r D^r$$

The objective is then to determine the response x(t) for a given forcing function u(t)
subject to the given set of initial conditions

$$D^r x(0) = \left[\frac{d^r x}{dt^r}\right]_{t=0} = c_r \quad (r = 0, 1, \ldots, n - 1)$$

Taking Laplace transforms in (5.20) and proceeding as before leads to

Example 5.24 


Solution

$$X(s) = \frac{p(s)}{q(s)}$$

where

$$p(s) = U(s) + \sum_{r=0}^{n-1} c_r \sum_{i=r+1}^{n} a_i s^{i-r-1}$$

Then, in principle, by taking the inverse transform, the desired response x(t) may be
obtained as

$$x(t) = \mathcal{L}^{-1}\left\{\frac{p(s)}{q(s)}\right\}$$

For high-order differential equations the process of performing this inversion may 

prove to be rather tedious, and matrix methods may be used as indicated in Section 5.7. 
To conclude this section, further worked examples are developed in order to help 

consolidate understanding of this method for solving linear differential equations. 


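Solve the differential equation

$$\frac{d^2x}{dt^2} + 6\frac{dx}{dt} + 9x = \sin t \quad (t \ge 0)$$

subject to the initial conditions x = 0 and dx/dt = 0 at t = 0.


Taking the Laplace transforms

$$\mathcal{L}\left\{\frac{d^2x}{dt^2}\right\} + 6\mathcal{L}\left\{\frac{dx}{dt}\right\} + 9\mathcal{L}\{x\} = \mathcal{L}\{\sin t\}$$

leads to the equation

$$[s^2X(s) - sx(0) - \dot{x}(0)] + 6[sX(s) - x(0)] + 9X(s) = \frac{1}{s^2 + 1}$$

which on rearrangement gives

$$(s^2 + 6s + 9)X(s) = \frac{1}{s^2 + 1} + (s + 6)x(0) + \dot{x}(0)$$

Incorporating the given initial conditions $x(0) = \dot{x}(0) = 0$ leads to

$$X(s) = \frac{1}{(s^2 + 1)(s + 3)^2}$$

Resolving into partial fractions gives

$$X(s) = \frac{3}{50}\,\frac{1}{s + 3} + \frac{1}{10}\,\frac{1}{(s + 3)^2} + \frac{2}{25}\,\frac{1}{s^2 + 1} - \frac{3}{50}\,\frac{s}{s^2 + 1}$$

that is,

$$X(s) = \frac{3}{50}\,\frac{1}{s + 3} + \frac{1}{10}\left[\frac{1}{s^2}\right]_{s \to s+3} + \frac{2}{25}\,\frac{1}{s^2 + 1} - \frac{3}{50}\,\frac{s}{s^2 + 1}$$

Taking inverse transforms, using the shift theorem, leads to the desired solution

$$x(t) = \tfrac{3}{50}e^{-3t} + \tfrac{1}{10}t e^{-3t} + \tfrac{2}{25}\sin t - \tfrac{3}{50}\cos t \quad (t \ge 0)$$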


In MATLAB, using the Symbolic Math Toolbox, the command dsolve computes
symbolic solutions to differential equations. The letter D denotes differentiation
whilst the symbols D2, D3, ..., DN denote the 2nd, 3rd, ..., Nth derivatives respectively.
The dependent variable is that preceded by D whilst the default independent
variable is t. The independent variable can be changed from t to another symbolic
variable by including that variable as the last input variable. The initial conditions
are specified by additional equations, such as Dx(0) = 6. If the initial conditions
are not specified the solution will contain constants of integration such as C1 and C2.

For the differential equation of Example 5.24 the MATLAB commands

    syms x t
    x = dsolve('D2x + 6*Dx + 9*x = sin(t)','x(0) = 0, Dx(0) = 0');
    pretty(simple(x))

return the solution

    x = -3/50cos(t) + 2/25sin(t) + 3/50(1/exp(t)^3) + 1/10(t/exp(t)^3)

It is left as an exercise to express 1/exp(t)^3 as $e^{-3t}$.
In MAPLE the command dsolve is also used and the commands

    ode2:= diff(x(t),t,t) + 6*diff(x(t),t) + 9*x(t) = sin(t);
    dsolve({ode2, x(0) = 0, D(x)(0) = 0}, x(t));

return the solution

    x(t) = 3/50 e^(-3t) + 1/10 e^(-3t) t - 3/50 cos(t) + 2/25 sin(t)

If the initial conditions were not specified then the command

    dsolve({ode2}, x(t));

returns the solution

    x(t) = e^(-3t) _C1 + e^(-3t) t _C2 - 3/50 cos(t) + 2/25 sin(t)

In MAPLE it is also possible to specify solution by the Laplace method and the
command

    dsolve({ode2, x(0) = 0, D(x)(0) = 0}, x(t), method = laplace);

also returns the solution

    x(t) = -3/50 cos(t) + 2/25 sin(t) + 1/50 e^(-3t) (5t + 3)

and, when initial conditions are not specified, the command

    dsolve({ode2}, x(t), method = laplace);

returns the solution

    x(t) = -3/50 cos(t) + 2/25 sin(t) + 1/50 e^(-3t) (50 t D(x)(0)
           + 150 t x(0) + 5t + 50x(0) + 3)
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Example 5.25 Solve the differential equation 
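$$\frac{d^3x}{dt^3} + 5\frac{d^2x}{dt^2} + 17\frac{dx}{dt} + 13x = 1 \quad (t \ge 0)$$

subject to the initial conditions $x = dx/dt = 1$ and $d^2x/dt^2 = 0$ at t = 0.


Solution Taking Laplace transforms

$$\mathcal{L}\left\{\frac{d^3x}{dt^3}\right\} + 5\mathcal{L}\left\{\frac{d^2x}{dt^2}\right\} + 17\mathcal{L}\left\{\frac{dx}{dt}\right\} + 13\mathcal{L}\{x\} = \mathcal{L}\{1\}$$

leads to the equation

$$s^3X(s) - s^2x(0) - s\dot{x}(0) - \ddot{x}(0) + 5[s^2X(s) - sx(0) - \dot{x}(0)] + 17[sX(s) - x(0)] + 13X(s) = \frac{1}{s}$$

which on rearrangement gives

$$(s^3 + 5s^2 + 17s + 13)X(s) = \frac{1}{s} + (s^2 + 5s + 17)x(0) + (s + 5)\dot{x}(0) + \ddot{x}(0)$$

Incorporating the given initial conditions $x(0) = \dot{x}(0) = 1$ and $\ddot{x}(0) = 0$ leads to

$$X(s) = \frac{s^3 + 6s^2 + 22s + 1}{s(s^3 + 5s^2 + 17s + 13)}$$

Clearly s + 1 is a factor of $s^3 + 5s^2 + 17s + 13$, and by algebraic division we have

$$X(s) = \frac{s^3 + 6s^2 + 22s + 1}{s(s + 1)(s^2 + 4s + 13)}$$

Resolving into partial fractions,

$$X(s) = \frac{\frac{1}{13}}{s} + \frac{\frac{8}{5}}{s + 1} - \frac{1}{65}\,\frac{44s + 7}{s^2 + 4s + 13} = \frac{\frac{1}{13}}{s} + \frac{\frac{8}{5}}{s + 1} - \frac{1}{65}\,\frac{44(s + 2) - 27(3)}{(s + 2)^2 + 3^2}$$

Taking inverse transforms, using the shift theorem, leads to the solution

$$x(t) = \tfrac{1}{13} + \tfrac{8}{5}e^{-t} - \tfrac{1}{65}e^{-2t}(44\cos 3t - 27\sin 3t) \quad (t \ge 0)$$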


Confirm that the answer may be checked using the commands

    syms x t
    x = dsolve('D3x + 5*D2x + 17*Dx + 13*x = 1','x(0) = 1, Dx(0) = 1,
    D2x(0) = 0');
    pretty(simple(x))

in MATLAB, or the commands

    ode3:= diff(x(t),t$3) + 5*diff(x(t),t$2) + 17*diff(x(t),t) + 13*x(t) = 1;
    dsolve({ode3, x(0) = 1, D(x)(0) = 1, (D@@2)(x)(0) = 0},
    x(t), method = laplace);

in MAPLE.
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5.3.4 


Example 5.26 


Solution 


Simultaneous differential equations 


In engineering we frequently encounter systems whose characteristics are modelled 
by a set of simultaneous linear differential equations with constant coefficients. The 
method of solution is essentially the same as that adopted in Section 5.3.3 for solving 
a single differential equation in one unknown. Taking Laplace transforms throughout, 
the system of simultaneous differential equations is transformed into a system of 
simultaneous algebraic equations, which are then solved for the transformed variables; 
inverse transforms then give the desired solutions. 


Solve for $t \ge 0$ the simultaneous first-order differential equations

$$\frac{dx}{dt} + \frac{dy}{dt} + 5x + 3y = e^{-t} \qquad (5.21)$$

$$2\frac{dx}{dt} + \frac{dy}{dt} + x + y = 3 \qquad (5.22)$$

subject to the initial conditions x = 2 and y = 1 at t = 0.


Taking Laplace transforms in (5.21) and (5.22) gives

$$sX(s) - x(0) + sY(s) - y(0) + 5X(s) + 3Y(s) = \frac{1}{s + 1}$$

$$2[sX(s) - x(0)] + sY(s) - y(0) + X(s) + Y(s) = \frac{3}{s}$$

Rearranging and incorporating the given initial conditions x(0) = 2 and y(0) = 1
leads to

$$(s + 5)X(s) + (s + 3)Y(s) = 3 + \frac{1}{s + 1} = \frac{3s + 4}{s + 1} \qquad (5.23)$$

$$(2s + 1)X(s) + (s + 1)Y(s) = 5 + \frac{3}{s} = \frac{5s + 3}{s} \qquad (5.24)$$

Hence, by taking Laplace transforms, the pair of simultaneous differential equations
(5.21) and (5.22) in x(t) and y(t) has been transformed into a pair of simultaneous
algebraic equations (5.23) and (5.24) in the transformed variables X(s) and Y(s).
These algebraic equations may now be solved simultaneously for X(s) and Y(s) using
standard algebraic techniques.

Solving first for X(s) gives

$$X(s) = \frac{2s^2 + 14s + 9}{s(s + 2)(s - 1)}$$

Resolving into partial fractions,

$$X(s) = -\frac{\frac{9}{2}}{s} - \frac{\frac{11}{6}}{s + 2} + \frac{\frac{25}{3}}{s - 1}$$

which on inversion gives

$$x(t) = -\tfrac{9}{2} - \tfrac{11}{6}e^{-2t} + \tfrac{25}{3}e^{t} \quad (t \ge 0) \qquad (5.25)$$

Likewise, solving for Y(s) gives

$$Y(s) = \frac{s^3 - 22s^2 - 39s - 15}{s(s + 1)(s + 2)(s - 1)}$$

Resolving into partial fractions,

$$Y(s) = \frac{\frac{15}{2}}{s} + \frac{\frac{1}{2}}{s + 1} + \frac{\frac{11}{2}}{s + 2} - \frac{\frac{25}{2}}{s - 1}$$

which on inversion gives

$$y(t) = \tfrac{15}{2} + \tfrac{1}{2}e^{-t} + \tfrac{11}{2}e^{-2t} - \tfrac{25}{2}e^{t} \quad (t \ge 0)$$

Thus the solution to the given pair of simultaneous differential equations is

$$x(t) = -\tfrac{9}{2} - \tfrac{11}{6}e^{-2t} + \tfrac{25}{3}e^{t}, \qquad y(t) = \tfrac{15}{2} + \tfrac{1}{2}e^{-t} + \tfrac{11}{2}e^{-2t} - \tfrac{25}{2}e^{t} \quad (t \ge 0)$$

Note: When solving a pair of first-order simultaneous differential equations such as
(5.21) and (5.22), an alternative approach to obtaining the value of y(t) having obtained
x(t) is to use (5.21) and (5.22) directly.

Eliminating dy/dt from (5.21) and (5.22) gives

$$2y = \frac{dx}{dt} - 4x - 3 + e^{-t}$$

Substituting the solution obtained in (5.25) for x(t) gives

$$2y = \left(\tfrac{11}{3}e^{-2t} + \tfrac{25}{3}e^{t}\right) - 4\left(-\tfrac{9}{2} - \tfrac{11}{6}e^{-2t} + \tfrac{25}{3}e^{t}\right) - 3 + e^{-t}$$

leading as before to the solution

$$y = \tfrac{15}{2} + \tfrac{1}{2}e^{-t} + \tfrac{11}{2}e^{-2t} - \tfrac{25}{2}e^{t}$$
A further alternative is to express (5.23) and (5.24) in matrix form and solve for X(s) 
and Y(s) using Gaussian elimination. 
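
For a 2 x 2 system the matrix solution can be sketched with exact rational arithmetic. The sketch below uses Cramer's rule rather than Gaussian elimination (a deliberate simplification for two unknowns), evaluates X(s) and Y(s) at a chosen numerical value of s, and compares them with the partial-fraction forms implied by (5.21) and (5.22) with x(0) = 2 and y(0) = 1. The function names are illustrative assumptions.

```python
from fractions import Fraction

def solve_transformed(s):
    """Solve (5.23)-(5.24) for X(s), Y(s) at one value of s by Cramer's rule."""
    s = Fraction(s)
    a11, a12, b1 = s + 5, s + 3, (3 * s + 4) / (s + 1)
    a21, a22, b2 = 2 * s + 1, s + 1, (5 * s + 3) / s
    det = a11 * a22 - a12 * a21
    X = (b1 * a22 - a12 * b2) / det
    Y = (a11 * b2 - a21 * b1) / det
    return X, Y

def partial_fraction_forms(s):
    """X(s) and Y(s) written as sums of simple partial fractions."""
    s = Fraction(s)
    X = Fraction(-9, 2) / s - Fraction(11, 6) / (s + 2) + Fraction(25, 3) / (s - 1)
    Y = (Fraction(15, 2) / s + Fraction(1, 2) / (s + 1)
         + Fraction(11, 2) / (s + 2) - Fraction(25, 2) / (s - 1))
    return X, Y

# Any s away from the poles 0, -1, -2, 1 can be used for the comparison.
print(solve_transformed(3), partial_fraction_forms(3))
```

Because the arithmetic is exact, the two pairs agree identically at every test point if the partial fractions are correct.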


In MATLAB the solution to the pair of simultaneous differential equations of
Example 5.26 may be obtained using the commands

    syms x y t
    [x,y] = dsolve('Dx + Dy + 5*x + 3*y = exp(-t)',
    '2*Dx + Dy + x + y = 3',
    'x(0) = 2, y(0) = 1')

which return

    x = -11/6*exp(-2*t) + 25/3*exp(t) - 9/2
    y = -25/2*exp(t) + 11/2*exp(-2*t) + 15/2 + 1/2*exp(-t)

These can then be expressed in typeset form using the commands pretty(x) and
pretty(y). In MAPLE the commands

    ode1:= D(x)(t) + D(y)(t) + 5*x(t) + 3*y(t) = exp(-t);
    ode2:= 2*D(x)(t) + D(y)(t) + x(t) + y(t) = 3;
    dsolve({ode1, ode2, x(0) = 2, y(0) = 1}, {x(t), y(t)});

return

    {x(t) = -9/2 - 11/6 e^(-2t) + 25/3 e^t,
     y(t) = 15/2 + 1/2 e^(-t) + 11/2 e^(-2t) - 25/2 e^t}

In principle, the same procedure as used in Example 5.26 can be employed to solve a 
pair of higher-order simultaneous differential equations or a larger system of differen- 
tial equations involving more unknowns. However, the algebra involved can become 
quite complicated, and matrix methods, considered in Section 5.7, are usually preferred. 


5.3.5 Exercises 


Check your answers using MATLAB or MAPLE. 


Using Laplace transform methods, solve for t => 0 (g) 


dx 


cyte ares" sin f 
d 


the following differential equations, subject to the di f 


specified initial conditions: 


dx ou 
(а) 20 +3х=6 


subject tox 2 2at t2 0 


(b) 305. 4 gin or 


dr 
subject to x=} atr=0 
(c) Ча 2% +зх=1 
dà dr 


subject to x 2 0 and wzo att=0 


(d) d'y +2® +у =4соз 2t 
t 


аг 

subject to y = 0 and v-a att=0 
(e) бхз 26“ 

dt dt 

subject to x = 0 and 9 =1 at t=0 
(f) dx +4®+5х=зе” 

dt dt 

subject to x = 4 and Ф __7 att=0 


dt 


(h) 


(0) 


Q) 


(k) 


D 


subject to x = | and wzo att=0 


dy 4 43) = 34 
P d 


d 


subject to y = 0 and xl at t=0 
d 44% 44, = à 6% 
dr dr 
subject to x 2 and = att=0 

ох ро tsx 

а dt 
subject to x 2 0 and n att-0 
х в рубу 16 sin 4; 
dt а 

--1 and ®— = 
subject to x 2 -5 and E^ t att=0 


9 dy +12 dy +4у=е' 
df t 


subject to y = 1 and 9-1 att-0 
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3 
(m) $2 2d d uo, us 
dt dr t 
2 
о and 2X =0 0 att=0 
dt аг 
(п) 9 TU x pdx pdr +x = cos 3f 
de dt dt 
subject х= 0, = 1 and SF = 1 ate=0 
t t 


Using Laplace transform methods, solve for t = 0 
the following simultaneous differential equations 
subject to the given initial conditions: 


(a) 29: - 389 -9- г 


dx 49 
2а 


subject to x = 0 and y - i att=0 


7 
Du di 


dx dy t 
2 = +3 = +х-у=е 
dt d 


+4х-37у=0 


+x-y=5sint 


subject tox = 0 andy =0 atr=0 


dx | dy 


GC a d 


+2х +у = 

ЧУ 5х +3у= 56” 

dt 

subject tox =—1 and y=4 att=0 


de 5 dy _ 


d ш оС 


(d) 3 


di ode. 
d um pe 


subject tox = 1 andy=1 att=0 


dx d 


(e) PISO 2x =3 sint +5 cost 
29: +07 ну = sin + соз! 


subject to x 2 0 and y 2 -1att- 0 


dx dy 
f HE .,- 
(f) — n |; y-t 


de 44 dy 
ЕЯ +х=1 


subject to x 2 1 and y 2 0att- 0 


(в) 29 +39 +7х= 14t +7 
dx _ 3 dy 


^d dt 


+4х +бу = 14t - 14 


subject tox=y=0att=0 
(b) diya 2x бу-у 
аг dr 


subject to x = 4, y = 2, dx/dt = 0 and dy/dt = 0 
att=0 


(i) 5х + У +6х= 0 
dà dà 


2 
sde +1692 +бу=0 
dt dt 

subject to x = 1 , y = 1, dx/dt = 0 and dy/dt = 0 
att=0 

dx dy dk dv. 
d? d? d: d 


jd - dy , dr dy. 5у- 7х 
^ d? d? df d 


(j) 2 -3y-9x 





subject to x = dx/dt= 1 and y= dy/dt=0 at t=0 


5.4 Engineering applications: electrical circuits and mechanical vibrations


To illustrate the use of Laplace transforms, we consider here their application to the
analysis of electrical circuits and vibrating mechanical systems. Since initial conditions
are automatically taken into account in the transformation process, the Laplace
transform is particularly attractive for examining the transient behaviour of such
systems.

Using the commands introduced in previous sections, MATLAB or MAPLE can be
used throughout this section to check answers obtained.

5.4.1 Electrical circuits

Passive electrical circuits are constructed of three basic elements: resistors (having
resistance R, measured in ohms Ω), capacitors (having capacitance C, measured in
farads F) and inductors (having inductance L, measured in henries H), with the
associated variables being current i(t) (measured in amperes A) and voltage v(t) (measured
in volts V). The current flow in the circuit is related to the charge q(t) (measured in
coulombs C) by the relationship

$$ i = \frac{dq}{dt} $$

Conventionally, the basic elements are represented symbolically as in Figure 5.7.

Figure 5.7 Constituent elements of an electrical circuit: (a) resistor; (b) capacitor; (c) inductor.

The relationships between the flow of current i(t) and the voltage drops v(t) across
these elements at time t are

$$ \text{voltage drop across resistor} = Ri \quad \text{(Ohm's law)} $$

$$ \text{voltage drop across capacitor} = \frac{1}{C}\int i\,dt = \frac{q}{C} $$

$$ \text{voltage drop across inductor} = L\frac{di}{dt} $$

The interaction between the individual elements making up an electrical circuit is
determined by Kirchhoff's laws:

Law 1

The algebraic sum of all the currents entering any junction (or node) of a circuit is zero.

Law 2

The algebraic sum of the voltage drops around any closed loop (or path) in a circuit is zero.

Use of these laws leads to circuit equations, which may then be analysed using Laplace
transform techniques.

Example 5.27

The LCR circuit of Figure 5.8 consists of a resistor R, a capacitor C and an inductor L
connected in series together with a voltage source e(t). Prior to closing the switch at
time t = 0, both the charge on the capacitor and the resulting current in the circuit are
zero. Determine the charge q(t) on the capacitor and the resulting current i(t) in the
circuit at time t given that R = 160 Ω, L = 1 H, C = 10^{-4} F and e(t) = 20 V.
Figure 5.8 LCR circuit of Example 5.27.

Solution

Applying Kirchhoff's second law to the circuit of Figure 5.8 gives

$$ L\frac{di}{dt} + Ri + \frac{1}{C}\int i\,dt = e(t) \tag{5.26} $$

or, using $i = dq/dt$,

$$ L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{1}{C}q = e(t) $$

Substituting the given values for L, R, C and e(t) gives

$$ \frac{d^2q}{dt^2} + 160\frac{dq}{dt} + 10^4 q = 20 $$

Taking Laplace transforms throughout leads to the equation

$$ (s^2 + 160s + 10^4)Q(s) = [sq(0) + \dot{q}(0)] + 160q(0) + \frac{20}{s} $$

where Q(s) is the transform of q(t). We are given that $q(0) = 0$ and $\dot{q}(0) = i(0) = 0$, so
that this reduces to

$$ (s^2 + 160s + 10^4)Q(s) = \frac{20}{s} $$

that is,

$$ Q(s) = \frac{20}{s(s^2 + 160s + 10^4)} $$

Resolving into partial fractions gives

$$ Q(s) = \frac{1}{500}\left[\frac{1}{s} - \frac{s + 160}{s^2 + 160s + 10^4}\right]
       = \frac{1}{500}\left[\frac{1}{s} - \frac{(s + 80) + \frac{4}{3}(60)}{(s + 80)^2 + 60^2}\right]
       = \frac{1}{500}\left[\frac{1}{s} - \left[\frac{s + \frac{4}{3}\times 60}{s^2 + 60^2}\right]_{s \to s + 80}\right] $$

Taking inverse transforms, making use of the shift theorem (Theorem 5.2), gives

$$ q(t) = \frac{1}{500}\left(1 - e^{-80t}\cos 60t - \tfrac{4}{3}e^{-80t}\sin 60t\right) $$

The resulting current i(t) in the circuit is then given by

$$ i(t) = \frac{dq}{dt} = \tfrac{1}{3}e^{-80t}\sin 60t $$
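The closed-form charge and current of Example 5.27 can be cross-checked numerically. The following Python sketch is not part of the original text: the function names, the step count and the choice of a classical fourth-order Runge-Kutta scheme are illustrative assumptions. It integrates $L\ddot{q} + R\dot{q} + q/C = e$ from rest and compares the result with the formulas obtained above.

```python
import math

def q_exact(t):
    """Charge from Example 5.27: (1/500)(1 - e^{-80t} cos 60t - (4/3) e^{-80t} sin 60t)."""
    return (1 - math.exp(-80 * t) * (math.cos(60 * t) + (4 / 3) * math.sin(60 * t))) / 500

def i_exact(t):
    """Current from Example 5.27: (1/3) e^{-80t} sin 60t."""
    return math.exp(-80 * t) * math.sin(60 * t) / 3

def rk4_lcr(t_end, steps=2000, L=1.0, R=160.0, C=1e-4, e=20.0):
    """Integrate L q'' + R q' + q/C = e with q(0) = q'(0) = 0 by classical RK4."""
    h = t_end / steps
    q, i = 0.0, 0.0

    def deriv(q, i):
        # State derivative: dq/dt = i, di/dt = (e - R i - q/C) / L
        return i, (e - R * i - q / C) / L

    for _ in range(steps):
        k1q, k1i = deriv(q, i)
        k2q, k2i = deriv(q + 0.5 * h * k1q, i + 0.5 * h * k1i)
        k3q, k3i = deriv(q + 0.5 * h * k2q, i + 0.5 * h * k2i)
        k4q, k4i = deriv(q + h * k3q, i + h * k3i)
        q += h * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        i += h * (k1i + 2 * k2i + 2 * k3i + k4i) / 6
    return q, i
```

Calling `rk4_lcr(0.02)` and comparing with `q_exact(0.02)` and `i_exact(0.02)` gives agreement to many significant figures, and `q_exact(t)` settles at the steady-state charge 1/500 C for large t.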
Note that we could have determined the current by taking Laplace transforms in (5.26).
Substituting the given values for L, R, C and e(t) and using (5.26) leads to the
transformed equation

$$ 160I(s) + sI(s) + \frac{10^4}{s}I(s) = \frac{20}{s} $$

that is,

$$ I(s) = \frac{20}{s^2 + 160s + 10^4} = \frac{1}{3}\,\frac{60}{(s + 80)^2 + 60^2} \quad (\text{since } q(0) = 0) $$

which, on taking inverse transforms, gives as before

$$ i(t) = \tfrac{1}{3}e^{-80t}\sin 60t $$

Example 5.28

In the parallel network of Figure 5.9 there is no current flowing in either loop prior to
closing the switch at time t = 0. Deduce the currents $i_1(t)$ and $i_2(t)$ flowing in the loops
at time t.

Figure 5.9 Parallel circuit of Example 5.28 (labels recoverable from the figure: $R_1 = 20\ \Omega$, $L_1 = 0.5$ H, e(t) = 200 V).

Solution

Applying Kirchhoff's first law to node X gives

$$ i = i_1 + i_2 $$

Applying Kirchhoff's second law to each of the two loops in turn gives

$$ R_1(i_1 + i_2) + L_1\frac{d}{dt}(i_1 + i_2) + R_2 i_1 = 200 $$

$$ L_2\frac{di_2}{dt} + R_3 i_2 - R_2 i_1 = 0 $$

Substituting the given values for the resistances and inductances gives

$$ \left.\begin{aligned}
\frac{di_1}{dt} + \frac{di_2}{dt} + 56i_1 + 40i_2 &= 400 \\
\frac{di_2}{dt} - 8i_1 + 10i_2 &= 0
\end{aligned}\right\} \tag{5.27} $$

Taking Laplace transforms and incorporating the initial conditions $i_1(0) = i_2(0) = 0$
leads to the transformed equations

$$ (s + 56)I_1(s) + (s + 40)I_2(s) = \frac{400}{s} \tag{5.28} $$

$$ -8I_1(s) + (s + 10)I_2(s) = 0 \tag{5.29} $$
Hence

$$ I_2(s) = \frac{3200}{s(s^2 + 74s + 880)} = \frac{3200}{s(s + 59.1)(s + 14.9)} $$

Resolving into partial fractions gives

$$ I_2(s) = \frac{3.64}{s} + \frac{1.22}{s + 59.1} - \frac{4.86}{s + 14.9} $$

which, on taking inverse transforms, leads to

$$ i_2(t) = 3.64 + 1.22e^{-59.1t} - 4.86e^{-14.9t} $$

From (5.27),

$$ i_1(t) = \frac{1}{8}\left(10i_2 + \frac{di_2}{dt}\right) $$

that is,

$$ i_1(t) = 4.55 - 7.49e^{-59.1t} + 2.98e^{-14.9t} $$

Note that as $t \to \infty$, the currents $i_1(t)$ and $i_2(t)$ approach the constant values 4.55 and 3.64
A respectively. (Note that $i(0) = i_1(0) + i_2(0) \ne 0$ due to rounding errors in the calculation.)

Example 5.29

A voltage e(t) is applied to the primary circuit at time t = 0, and mutual induction
M drives the current $i_2(t)$ in the secondary circuit of Figure 5.10. If, prior to closing
the switch, the currents in both circuits are zero, determine the induced current $i_2(t)$
in the secondary circuit at time t when $R_1 = 4\ \Omega$, $R_2 = 10\ \Omega$, $L_1 = 2$ H, $L_2 = 8$ H,
M = 2 H and e(t) = 28 sin 2t V.

Figure 5.10 Circuit of Example 5.29.

Solution

Applying Kirchhoff's second law to the primary and secondary circuits respectively gives

$$ R_1 i_1 + L_1\frac{di_1}{dt} + M\frac{di_2}{dt} = e(t) $$

$$ R_2 i_2 + L_2\frac{di_2}{dt} + M\frac{di_1}{dt} = 0 $$

Substituting the given values for the resistances, inductances and applied voltage leads to

$$ 2\frac{di_1}{dt} + 4i_1 + 2\frac{di_2}{dt} = 28\sin 2t $$

$$ 2\frac{di_1}{dt} + 8\frac{di_2}{dt} + 10i_2 = 0 $$
Taking Laplace transforms and noting that $i_1(0) = i_2(0) = 0$ leads to the equations

$$ (s + 2)I_1(s) + sI_2(s) = \frac{28}{s^2 + 4} \tag{5.30} $$

$$ sI_1(s) + (4s + 5)I_2(s) = 0 \tag{5.31} $$

Solving for $I_2(s)$ yields

$$ I_2(s) = -\frac{28s}{(3s + 10)(s + 1)(s^2 + 4)} $$

Resolving into partial fractions gives

$$ I_2(s) = -\frac{\frac{45}{17}}{3s + 10} + \frac{\frac{4}{5}}{s + 1} + \frac{7}{85}\,\frac{s - 26}{s^2 + 4} $$

Taking inverse Laplace transforms gives the current in the secondary circuit as

$$ i_2(t) = \tfrac{4}{5}e^{-t} - \tfrac{15}{17}e^{-10t/3} + \tfrac{7}{85}\cos 2t - \tfrac{91}{85}\sin 2t $$

As $t \to \infty$, the current will approach the sinusoidal response

$$ i_2(t) = \tfrac{7}{85}\cos 2t - \tfrac{91}{85}\sin 2t $$

5.4.2 Mechanical vibrations

Mechanical translational systems may be used to model many situations, and involve
three basic elements: masses (having mass M, measured in kg), springs (having spring
stiffness K, measured in N m^{-1}) and dampers (having damping coefficient B, measured
in N s m^{-1}). The associated variables are displacement x(t) (measured in m) and force
F(t) (measured in N). Conventionally, the basic elements are represented symbolically
as in Figure 5.11.

Figure 5.11 Constituent elements of a translational mechanical system: (a) mass; (b) spring; (c) damper.

Assuming we are dealing with ideal springs and dampers (that is, assuming that they
behave linearly), the relationships between the forces and displacements at time t are:

$$ \text{mass:} \quad F = M\frac{d^2x}{dt^2} = M\ddot{x} \quad \text{(Newton's law)} $$

$$ \text{spring:} \quad F = K(x_2 - x_1) \quad \text{(Hooke's law)} $$

$$ \text{damper:} \quad F = B\left(\frac{dx_2}{dt} - \frac{dx_1}{dt}\right) = B(\dot{x}_2 - \dot{x}_1) $$
Using these relationships leads to the system equations, which may then be analysed
using Laplace transform techniques.

Example 5.30

The mass of the mass-spring-damper system of Figure 5.12(a) is subjected to an
externally applied periodic force $F(t) = 4\sin\omega t$ at time t = 0. Determine the resulting
displacement x(t) of the mass at time t, given that $x(0) = \dot{x}(0) = 0$, for the two cases

(a) $\omega = 2$    (b) $\omega = 5$

In the case $\omega = 5$, what would happen to the response if the damper were missing?

Figure 5.12 Mass-spring-damper system of Example 5.30: (a) system with applied force $F(t) = 4\sin\omega t$; (b) forces acting on the mass, with $F_1(t) = Kx(t)$ and $F_2(t) = B\dot{x}(t)$.

Solution

As indicated in Figure 5.12(b), the forces acting on the mass M are the applied force
F(t) and the restoring forces $F_1$ and $F_2$ due to the spring and damper respectively. Thus,
by Newton's law,

$$ M\ddot{x}(t) = F(t) - F_1(t) - F_2(t) $$

Since M = 1, $F(t) = 4\sin\omega t$, $F_1(t) = Kx(t) = 25x(t)$ and $F_2(t) = B\dot{x}(t) = 6\dot{x}(t)$, this gives

$$ \ddot{x}(t) + 6\dot{x}(t) + 25x(t) = 4\sin\omega t \tag{5.32} $$

as the differential equation representing the motion of the system.
Taking Laplace transforms throughout in (5.32) gives

$$ (s^2 + 6s + 25)X(s) = [sx(0) + \dot{x}(0)] + 6x(0) + \frac{4\omega}{s^2 + \omega^2} $$

where X(s) is the transform of x(t). Incorporating the given initial conditions
$x(0) = \dot{x}(0) = 0$ leads to

$$ X(s) = \frac{4\omega}{(s^2 + \omega^2)(s^2 + 6s + 25)} \tag{5.33} $$

In case (a), with $\omega = 2$, (5.33) gives

$$ X(s) = \frac{8}{(s^2 + 4)(s^2 + 6s + 25)} $$
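which, on resolving into partial fractions, leads to

$$ X(s) = \frac{4}{195}\,\frac{-4s + 14}{s^2 + 4} + \frac{2}{195}\,\frac{8s + 20}{s^2 + 6s + 25}
       = \frac{4}{195}\,\frac{-4s + 14}{s^2 + 4} + \frac{2}{195}\,\frac{8(s + 3) - 4}{(s + 3)^2 + 16} $$

Taking inverse Laplace transforms gives the required response

$$ x(t) = \tfrac{4}{195}(7\sin 2t - 4\cos 2t) + \tfrac{2}{195}e^{-3t}(8\cos 4t - \sin 4t) \tag{5.34} $$

In case (b), with $\omega = 5$, (5.33) gives

$$ X(s) = \frac{20}{(s^2 + 25)(s^2 + 6s + 25)} \tag{5.35} $$

that is,

$$ X(s) = -\frac{\frac{2}{15}s}{s^2 + 25} + \frac{1}{15}\,\frac{2(s + 3) + 6}{(s + 3)^2 + 16} $$

which, on taking inverse Laplace transforms, gives the required response

$$ x(t) = -\tfrac{2}{15}\cos 5t + \tfrac{1}{15}e^{-3t}\left(2\cos 4t + \tfrac{3}{2}\sin 4t\right) \tag{5.36} $$

If the damping term were missing then (5.35) would become

$$ X(s) = \frac{20}{(s^2 + 25)^2} \tag{5.37} $$

By Theorem 5.3,

$$ \mathcal{L}\{t\cos 5t\} = -\frac{d}{ds}\mathcal{L}\{\cos 5t\} = -\frac{d}{ds}\left(\frac{s}{s^2 + 25}\right) $$

that is,

$$ \mathcal{L}\{t\cos 5t\} = -\frac{1}{s^2 + 25} + \frac{2s^2}{(s^2 + 25)^2}
 = \frac{1}{s^2 + 25} - \frac{50}{(s^2 + 25)^2}
 = \tfrac{1}{5}\mathcal{L}\{\sin 5t\} - \frac{50}{(s^2 + 25)^2} $$

Thus, by the linearity property (5.11),

$$ \mathcal{L}\{\tfrac{1}{5}\sin 5t - t\cos 5t\} = \frac{50}{(s^2 + 25)^2} $$

so that taking inverse Laplace transforms in (5.37) gives the response as

$$ x(t) = \tfrac{2}{25}(\sin 5t - 5t\cos 5t) $$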
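As a check on case (b) of Example 5.30, the sketch below substitutes the damped response (5.36) and the undamped response just obtained back into the equation of motion, estimating the derivatives by central differences. This Python fragment is illustrative only and is not from the original text: the helper names and the step size h are assumptions.

```python
import math

def x_damped(t):
    """Response (5.36): -(2/15) cos 5t + (1/15) e^{-3t} (2 cos 4t + (3/2) sin 4t)."""
    return -2 / 15 * math.cos(5 * t) + math.exp(-3 * t) * (2 * math.cos(4 * t) + 1.5 * math.sin(4 * t)) / 15

def x_undamped(t):
    """Undamped response: (2/25)(sin 5t - 5t cos 5t)."""
    return 2 / 25 * (math.sin(5 * t) - 5 * t * math.cos(5 * t))

def residual(x, t, damping, h=1e-4):
    """Central-difference estimate of x'' + damping*x' + 25x - 4 sin 5t at time t."""
    d1 = (x(t + h) - x(t - h)) / (2 * h)
    d2 = (x(t + h) - 2 * x(t) + x(t - h)) / h ** 2
    return d2 + damping * d1 + 25 * x(t) - 4 * math.sin(5 * t)
```

Both residuals are zero to within the accuracy of the difference approximation, while `x_undamped` evaluated at times where cos 5t = 1 has magnitude 0.4t, which grows without bound.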
Because of the term t cos 5t, the response x(t) is unbounded as $t \to \infty$. This arises
because in this case the applied force F(t) = 4 sin 5t is in resonance with the system
(that is, the vibrating mass), whose natural oscillating frequency is $5/2\pi$ Hz, equal to
that of the applied force. Even in the presence of damping, the amplitude of the system
response is maximized when the applied force is approaching resonance with the system.
(This is left as an exercise for the reader.) In the absence of damping we have the
limiting case of pure resonance, leading to an unbounded response. As noted in
Section 10.10.3 of Modern Engineering Mathematics, resonance is of practical importance,
since, for example, it can lead to large and strong structures collapsing under what
appears to be a relatively small force.

Example 5.31

Consider the mechanical system of Figure 5.13(a), which consists of two masses $M_1 = 1$
and $M_2 = 2$, each attached to a fixed base by a spring, having constants $K_1 = 1$ and
$K_3 = 2$ respectively, and attached to each other by a third spring having constant $K_2 = 2$.
The system is released from rest at time t = 0 in a position in which $M_1$ is displaced
1 unit to the left of its equilibrium position and $M_2$ is displaced 2 units to the right of its
equilibrium position. Neglecting all frictional effects, determine the positions of the
masses at time t.

Figure 5.13 Two-mass system of Example 5.31: (a) system; (b) forces on the masses, with $F_1 = K_1x_1$, $F_2 = K_2(x_2 - x_1)$ and $F_3 = K_3x_2$.

Solution

Let $x_1(t)$ and $x_2(t)$ denote the displacements of the masses $M_1$ and $M_2$ respectively from
their equilibrium positions. Since frictional effects are neglected, the only forces acting
on the masses are the restoring forces due to the springs, as shown in Figure 5.13(b).
Applying Newton's law to the motions of $M_1$ and $M_2$ respectively gives

$$ M_1\ddot{x}_1 = F_2 - F_1 = K_2(x_2 - x_1) - K_1x_1 $$

$$ M_2\ddot{x}_2 = -F_3 - F_2 = -K_3x_2 - K_2(x_2 - x_1) $$

which, on substituting the given values for $M_1$, $M_2$, $K_1$, $K_2$ and $K_3$, gives

$$ \ddot{x}_1 + 3x_1 - 2x_2 = 0 \tag{5.38} $$

$$ 2\ddot{x}_2 + 4x_2 - 2x_1 = 0 \tag{5.39} $$

Taking Laplace transforms leads to the equations

$$ (s^2 + 3)X_1(s) - 2X_2(s) = sx_1(0) + \dot{x}_1(0) $$

$$ -X_1(s) + (s^2 + 2)X_2(s) = sx_2(0) + \dot{x}_2(0) $$
Since $x_1(t)$ and $x_2(t)$ denote displacements to the right of the equilibrium positions, we
have $x_1(0) = -1$ and $x_2(0) = 2$. Also, the system is released from rest, so that
$\dot{x}_1(0) = \dot{x}_2(0) = 0$. Incorporating these initial conditions, the transformed equations become

$$ (s^2 + 3)X_1(s) - 2X_2(s) = -s \tag{5.40} $$

$$ -X_1(s) + (s^2 + 2)X_2(s) = 2s \tag{5.41} $$

Hence

$$ X_2(s) = \frac{2s^3 + 5s}{(s^2 + 4)(s^2 + 1)} $$

Resolving into partial fractions gives

$$ X_2(s) = \frac{s}{s^2 + 1} + \frac{s}{s^2 + 4} $$

which, on taking inverse Laplace transforms, leads to the response

$$ x_2(t) = \cos t + \cos 2t $$

Substituting for $x_2(t)$ in (5.39) gives

$$ x_1(t) = 2x_2(t) + \ddot{x}_2(t) = 2\cos t + 2\cos 2t - \cos t - 4\cos 2t $$

that is,

$$ x_1(t) = \cos t - 2\cos 2t $$
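The algebra of Example 5.31 can be confirmed with a few lines of Python. In this sketch (an illustration added here, with names chosen for convenience rather than taken from the text) the natural frequencies are read off from the denominator $(s^2 + 3)(s^2 + 2) - 2 = s^4 + 5s^2 + 4$, treated as a quadratic in $u = s^2$, and the responses are substituted back into (5.38) and (5.39) using their exact second derivatives.

```python
import math

# Denominator of X2(s): s^4 + 5 s^2 + 4, a quadratic u^2 + b u + c in u = s^2.
b, c = 5.0, 4.0
disc = math.sqrt(b * b - 4 * c)
u_roots = sorted([(-b + disc) / 2, (-b - disc) / 2], reverse=True)
natural_freqs = [math.sqrt(-u) for u in u_roots]   # angular frequencies in rad/s

def x1(t):
    return math.cos(t) - 2 * math.cos(2 * t)

def x2(t):
    return math.cos(t) + math.cos(2 * t)

def x1_dd(t):
    # Second derivative of x1, differentiated by hand
    return -math.cos(t) + 8 * math.cos(2 * t)

def x2_dd(t):
    # Second derivative of x2, differentiated by hand
    return -math.cos(t) - 4 * math.cos(2 * t)

def residuals(t):
    """Left-hand sides of (5.38) and (5.39); both should vanish."""
    r1 = x1_dd(t) + 3 * x1(t) - 2 * x2(t)
    r2 = 2 * x2_dd(t) + 4 * x2(t) - 2 * x1(t)
    return r1, r2
```

The two natural angular frequencies come out as 1 and 2 rad/s, matching the cos t and cos 2t terms in the responses, and the initial displacements $x_1(0) = -1$, $x_2(0) = 2$ are reproduced.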


Thus the positions of the masses at time t are

$$ x_1(t) = \cos t - 2\cos 2t, \qquad x_2(t) = \cos t + \cos 2t $$

5.4.3 Exercises

Check your answers using MATLAB or MAPLE whenever possible.


7 Use the Laplace transform technique to find the
transforms $I_1(s)$ and $I_2(s)$ of the respective currents
flowing in the circuit of Figure 5.14, where $i_1(t)$ is
that through the capacitor and $i_2(t)$ that through
the resistance. Hence, determine $i_2(t)$. (Initially,
$i_1(0) = i_2(0) = q_1(0) = 0$.) Sketch $i_2(t)$ for large
values of t.

Figure 5.14 Circuit of Exercise 7 (labels recoverable from the figure include a 50 μF capacitor and the applied voltage E sin 100t, switched on at t = 0).

8 At time t = 0, with no currents flowing, a voltage
v(t) = 10 sin t is applied to the primary circuit of
a transformer that has a mutual inductance of 1 H,
as shown in Figure 5.15. Denoting the current
flowing at time t in the secondary circuit by $i_2(t)$,
show that

$$ \mathcal{L}\{i_2(t)\} = \frac{10s}{(s^2 + 7s + 6)(s^2 + 1)} $$

and deduce that

$$ i_2(t) = -e^{-t} + \tfrac{12}{37}e^{-6t} + \tfrac{25}{37}\cos t + \tfrac{35}{37}\sin t $$

Figure 5.15 Circuit of Exercise 8 (M = 1 H, v(t) = 10 sin t).

9 In the circuit of Figure 5.16 there is no energy
stored (that is, there is no charge on the capacitors
and no current flowing in the inductances) prior to
the closure of the switch at time t = 0. Determine
$i_1(t)$ for t > 0 for a constant applied voltage
$E_0 = 10$ V.

Figure 5.16 Circuit of Exercise 9.

10 Determine the displacements of the masses $M_1$ and
$M_2$ in Figure 5.13 at time t > 0 when

$M_1 = M_2 = 1$

$K_1 = 1$, $K_2 = 3$ and $K_3 = 9$

What are the natural frequencies of the
system?

11 When testing the landing-gear unit of a space
vehicle, drop tests are carried out. Figure 5.17 is a
schematic model of the unit at the instant when it
first touches the ground. At this instant the spring
is fully extended and the velocity of the mass is
$\sqrt{2gh}$, where h is the height from which the
unit has been dropped. Obtain the equation
representing the displacement of the mass at
time t > 0 when M = 50 kg, B = 180 N s m^{-1} and
K = 474.5 N m^{-1}, and investigate the effects of
different dropping heights h. (g is the acceleration
due to gravity, and may be taken as 9.8 m s^{-2}.)

Figure 5.17 Landing-gear of Exercise 11.


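12 Consider the mass-spring-damper system
of Figure 5.18, which may be subject to two
input forces $u_1(t)$ and $u_2(t)$. Show that the
displacements $x_1(t)$ and $x_2(t)$ of the two masses
are given by

$$ x_1(t) = \mathcal{L}^{-1}\left\{\frac{M_2s^2 + B_1s + K_2}{\Delta}U_1(s) + \frac{B_1s}{\Delta}U_2(s)\right\} $$

$$ x_2(t) = \mathcal{L}^{-1}\left\{\frac{B_1s}{\Delta}U_1(s) + \frac{M_1s^2 + B_1s + K_1}{\Delta}U_2(s)\right\} $$

where

$$ \Delta = (M_1s^2 + B_1s + K_1)(M_2s^2 + B_1s + K_2) - B_1^2s^2 $$

Figure 5.18 Mechanical system of Exercise 12.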
5.5 Step and impulse functions

5.5.1 The Heaviside step function

In Sections 5.3 and 5.4 we considered linear differential equations in which the forcing
functions were continuous. In many engineering applications the forcing function may
frequently be discontinuous, for example a square wave resulting from an on/off
switch. In order to accommodate such discontinuous functions, we use the Heaviside
unit step function H(t), which, as we saw in Section 5.2.1, is defined by

$$ H(t) = \begin{cases} 0 & (t < 0) \\ 1 & (t \ge 0) \end{cases} $$

and is illustrated graphically in Figure 5.19(a). The Heaviside function is also frequently
referred to simply as the unit step function. A function representing a unit step
at t = a may be obtained by a horizontal translation of duration a. This is depicted
graphically in Figure 5.19(b), and defined by

$$ H(t - a) = \begin{cases} 0 & (t < a) \\ 1 & (t \ge a) \end{cases} $$

Figure 5.19 Heaviside unit step function: (a) H(t); (b) H(t - a).

The product function f(t)H(t - a) takes values

$$ f(t)H(t - a) = \begin{cases} 0 & (t < a) \\ f(t) & (t \ge a) \end{cases} $$

so the function H(t - a) may be interpreted as a device for 'switching on' the function
f(t) at t = a. In this way the unit step function may be used to write a concise formulation
of piecewise-continuous functions. To illustrate this, consider the piecewise-continuous
function f(t) illustrated in Figure 5.20 and defined by

$$ f(t) = \begin{cases} f_1(t) & (0 \le t < t_1) \\ f_2(t) & (t_1 \le t < t_2) \\ f_3(t) & (t \ge t_2) \end{cases} $$

Figure 5.20 Piecewise-continuous function.

To construct this function f(t), we could use the following 'switching' operations:

(a) switch on the function $f_1(t)$ at t = 0;
(b) switch on the function $f_2(t)$ at $t = t_1$ and at the same time switch off the function $f_1(t)$;
(c) switch on the function $f_3(t)$ at $t = t_2$ and at the same time switch off the function $f_2(t)$.

In terms of the unit step function, the function f(t) may thus be expressed as

$$ f(t) = f_1(t)H(t) + [f_2(t) - f_1(t)]H(t - t_1) + [f_3(t) - f_2(t)]H(t - t_2) $$

Alternatively, f(t) may be constructed using the top hat function H(t - a) - H(t - b).
Clearly,

$$ H(t - a) - H(t - b) = \begin{cases} 1 & (a \le t < b) \\ 0 & \text{otherwise} \end{cases} \tag{5.42} $$

which, as illustrated in Figure 5.21, gives

$$ f(t)[H(t - a) - H(t - b)] = \begin{cases} f(t) & (a \le t < b) \\ 0 & \text{otherwise} \end{cases} $$

Figure 5.21 Top hat function H(t - a) - H(t - b).

Using this approach, the function f(t) of Figure 5.20 may be expressed as

$$ f(t) = f_1(t)[H(t) - H(t - t_1)] + f_2(t)[H(t - t_1) - H(t - t_2)] + f_3(t)H(t - t_2) $$

giving, as before,

$$ f(t) = f_1(t)H(t) + [f_2(t) - f_1(t)]H(t - t_1) + [f_3(t) - f_2(t)]H(t - t_2) $$

It is easily checked that this corresponds to the given formulation, since for $0 \le t < t_1$

$$ H(t) = 1, \quad H(t - t_1) = H(t - t_2) = 0 $$

giving

$$ f(t) = f_1(t) \quad (0 \le t < t_1) $$

while for $t_1 \le t < t_2$

$$ H(t) = H(t - t_1) = 1, \quad H(t - t_2) = 0 $$

giving

$$ f(t) = f_1(t) + [f_2(t) - f_1(t)] = f_2(t) \quad (t_1 \le t < t_2) $$
and finally for $t \ge t_2$

$$ H(t) = H(t - t_1) = H(t - t_2) = 1 $$

giving

$$ f(t) = f_1(t) + [f_2(t) - f_1(t)] + [f_3(t) - f_2(t)] = f_3(t) \quad (t \ge t_2) $$

Example 5.32

Express in terms of unit step functions the piecewise-continuous causal function

$$ f(t) = \begin{cases} 2t^2 & (0 \le t < 3) \\ t + 4 & (3 \le t < 5) \\ 9 & (t \ge 5) \end{cases} $$

Solution

f(t) is depicted graphically in Figure 5.22, and in terms of unit step functions it may be
expressed as

$$ f(t) = 2t^2H(t) + (t + 4 - 2t^2)H(t - 3) + (9 - t - 4)H(t - 5) $$

That is,

$$ f(t) = 2t^2H(t) - (2t^2 - t - 4)H(t - 3) - (t - 5)H(t - 5) $$

Figure 5.22 Piecewise-continuous function of Example 5.32.

Example 5.33

Express in terms of unit step functions the piecewise-continuous causal function

$$ f(t) = \begin{cases} 0 & (t < 1) \\ 1 & (1 \le t < 3) \\ 3 & (3 \le t < 5) \\ 2 & (5 \le t < 6) \\ 0 & (t \ge 6) \end{cases} $$

Solution

f(t) is depicted graphically in Figure 5.23, and in terms of unit step functions it may be
expressed as

$$ f(t) = 1H(t - 1) + (3 - 1)H(t - 3) + (2 - 3)H(t - 5) + (0 - 2)H(t - 6) $$

That is,

$$ f(t) = 1H(t - 1) + 2H(t - 3) - 1H(t - 5) - 2H(t - 6) $$
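The step-function representation of Example 5.33 is easy to test by direct evaluation. The short Python sketch below is an added illustration (the function names are hypothetical, not from the text): it codes the Heaviside function with H(0) = 1, as in the definition above, and compares the step-function form with the original piecewise definition.

```python
def H(t):
    """Heaviside unit step, taking the value 1 at t = 0 as in the definition above."""
    return 1.0 if t >= 0 else 0.0

def f_steps(t):
    """Example 5.33 written with unit steps: H(t-1) + 2H(t-3) - H(t-5) - 2H(t-6)."""
    return H(t - 1) + 2 * H(t - 3) - H(t - 5) - 2 * H(t - 6)

def f_piecewise(t):
    """Example 5.33 in its original piecewise form."""
    if t < 1:
        return 0.0
    if t < 3:
        return 1.0
    if t < 5:
        return 3.0
    if t < 6:
        return 2.0
    return 0.0
```

Evaluating both functions on a grid of times, including the switching instants t = 1, 3, 5 and 6, gives identical values, which confirms that each coefficient in the step-function form is the size of the jump at the corresponding instant.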
Figure 5.23 Piecewise-continuous function of Example 5.33.

5.5.2 Laplace transform of unit step function

By definition of the Laplace transform, the transform of H(t - a), $a \ge 0$, is given by

$$ \mathcal{L}\{H(t - a)\} = \int_0^\infty H(t - a)e^{-st}\,dt = \int_0^a 0\,e^{-st}\,dt + \int_a^\infty 1\,e^{-st}\,dt
 = \left[\frac{e^{-st}}{-s}\right]_a^\infty = \frac{e^{-as}}{s} $$

That is,

$$ \mathcal{L}\{H(t - a)\} = \frac{e^{-as}}{s} \quad (a \ge 0) \tag{5.43} $$

and in the particular case of a = 0

$$ \mathcal{L}\{H(t)\} = \frac{1}{s} \tag{5.44} $$

This may be implemented in MATLAB using the commands

syms s t
H=sym('Heaviside(t)')
laplace(H)

which return

ans=1/s

It may also be obtained directly using the command

laplace(sym('Heaviside(t)'))

Likewise to obtain the Laplace transform of H(t-2) we use the commands

H2=sym('Heaviside(t-2)')
laplace(H2)
which return

ans=exp(-2*s)/s

In MAPLE the results are obtained using the commands:

with(inttrans):
laplace(Heaviside(t),t,s);
laplace(Heaviside(t-2),t,s);

Example 5.34

Determine the Laplace transform of the rectangular pulse

$$ f(t) = \begin{cases} 0 & (t < a) \\ K & (a \le t < b) \\ 0 & (t \ge b) \end{cases} \qquad K \text{ constant}, \quad b > a > 0 $$

Solution

The pulse is depicted graphically in Figure 5.24. In terms of unit step functions, it may
be expressed, using the top hat function, as

$$ f(t) = K[H(t - a) - H(t - b)] $$

Figure 5.24 Rectangular pulse.

Then, taking Laplace transforms,

$$ \mathcal{L}\{f(t)\} = K\mathcal{L}\{H(t - a)\} - K\mathcal{L}\{H(t - b)\} $$

which, on using the result (5.43), gives

$$ \mathcal{L}\{f(t)\} = K\frac{e^{-as}}{s} - K\frac{e^{-bs}}{s} $$

That is,

$$ \mathcal{L}\{f(t)\} = \frac{K}{s}\left(e^{-as} - e^{-bs}\right) $$

Example 5.35

Determine the Laplace transform of the piecewise-constant function f(t) shown in
Figure 5.23.

Solution

From Example 5.33 f(t) may be expressed as

$$ f(t) = 1H(t - 1) + 2H(t - 3) - 1H(t - 5) - 2H(t - 6) $$

Taking Laplace transforms,

$$ \mathcal{L}\{f(t)\} = 1\mathcal{L}\{H(t - 1)\} + 2\mathcal{L}\{H(t - 3)\} - 1\mathcal{L}\{H(t - 5)\} - 2\mathcal{L}\{H(t - 6)\} $$

which, on using the result (5.43), gives

$$ \mathcal{L}\{f(t)\} = \frac{e^{-s}}{s} + 2\frac{e^{-3s}}{s} - \frac{e^{-5s}}{s} - 2\frac{e^{-6s}}{s} $$

That is,

$$ \mathcal{L}\{f(t)\} = \frac{1}{s}\left(e^{-s} + 2e^{-3s} - e^{-5s} - 2e^{-6s}\right) $$
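The result of Example 5.35 can also be reached without step functions by integrating $e^{-st}$ over each interval on which f(t) is constant. The following Python sketch, added here as an illustration with hypothetical names, sums those segment integrals and compares the total with the formula above.

```python
import math

# Constant segments of the function of Figure 5.23: (start, end, value)
SEGMENTS = [(1.0, 3.0, 1.0), (3.0, 5.0, 3.0), (5.0, 6.0, 2.0)]

def transform_by_segments(s):
    """Sum of value * integral of e^{-st} over [start, end] for each constant segment."""
    return sum(v * (math.exp(-a * s) - math.exp(-b * s)) / s for a, b, v in SEGMENTS)

def transform_by_steps(s):
    """Formula of Example 5.35: (e^{-s} + 2e^{-3s} - e^{-5s} - 2e^{-6s}) / s."""
    return (math.exp(-s) + 2 * math.exp(-3 * s) - math.exp(-5 * s) - 2 * math.exp(-6 * s)) / s
```

The two functions agree for any s > 0, because regrouping the segment sum by exponential reproduces exactly the jump sizes 1, 2, -1 and -2 used in the step-function form.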
Check that the same answer is obtained using the MATLAB sequence of commands

syms s t
H1=sym('Heaviside(t-1)');
H3=sym('Heaviside(t-3)');
H5=sym('Heaviside(t-5)');
H6=sym('Heaviside(t-6)');
laplace(H1+2*H3-H5-2*H6)

In MAPLE the commands

with(inttrans):
laplace(Heaviside(t-1)+Heaviside(t-3)*2 - Heaviside(t-5)
- Heaviside(t-6)*2,t,s);

return the answer

$$ \frac{e^{-s} + 2e^{-3s} - e^{-5s} - 2e^{-6s}}{s} $$

5.5.3 The second shift theorem

This theorem is dual to the first shift theorem given as Theorem 5.2, and is sometimes
referred to as the Heaviside or delay theorem.

Theorem 5.4 If $\mathcal{L}\{f(t)\} = F(s)$ then for a positive constant a

$$ \mathcal{L}\{f(t - a)H(t - a)\} = e^{-as}F(s) $$

Proof By definition,

$$ \mathcal{L}\{f(t - a)H(t - a)\} = \int_0^\infty f(t - a)H(t - a)e^{-st}\,dt = \int_a^\infty f(t - a)e^{-st}\,dt $$

Making the substitution T = t - a,

$$ \mathcal{L}\{f(t - a)H(t - a)\} = \int_0^\infty f(T)e^{-s(T + a)}\,dT = e^{-sa}\int_0^\infty f(T)e^{-sT}\,dT $$

Since $F(s) = \mathcal{L}\{f(t)\} = \int_0^\infty f(T)e^{-sT}\,dT$, it follows that

$$ \mathcal{L}\{f(t - a)H(t - a)\} = e^{-as}F(s) $$

end of theorem
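Theorem 5.4 can be illustrated numerically by approximating the defining integral directly. The Python sketch below is an added example rather than part of the text: the choice f(t) = sin t, the delay a = 2, the value s = 1.5, the truncation point and the use of composite Simpson's rule are all assumptions made for the illustration.

```python
import math

def laplace_numeric(g, s, t_max=40.0, n=40000):
    """Composite Simpson approximation of the integral of g(t) e^{-st} over [0, t_max].

    n must be even; the tail beyond t_max is neglected, which is harmless when s*t_max is large.
    """
    h = t_max / n
    total = g(0.0) + g(t_max) * math.exp(-s * t_max)
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * g(t) * math.exp(-s * t)
    return total * h / 3

def delayed(f, a):
    """Return the function t -> f(t - a) H(t - a)."""
    return lambda t: f(t - a) if t >= a else 0.0

a, s = 2.0, 1.5
lhs = laplace_numeric(delayed(math.sin, a), s)   # transform of sin(t - a) H(t - a)
rhs = math.exp(-a * s) / (s * s + 1)             # e^{-as} F(s) with F(s) = 1/(s^2 + 1)
```

The two values `lhs` and `rhs` agree to well within the quadrature error, as the theorem predicts: delaying sin t by a units multiplies its transform $1/(s^2 + 1)$ by $e^{-as}$.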
It is important to distinguish between the two functions f(t)H(t - a) and f(t - a)H(t - a).
As we saw earlier, f(t)H(t - a) simply indicates that the function f(t) is 'switched on'
at time t = a, so that

$$ f(t)H(t - a) = \begin{cases} 0 & (t < a) \\ f(t) & (t \ge a) \end{cases} $$

On the other hand, f(t - a)H(t - a) represents a translation of the function f(t) by a units
to the right (to the right, since a > 0), so that

$$ f(t - a)H(t - a) = \begin{cases} 0 & (t < a) \\ f(t - a) & (t \ge a) \end{cases} $$

The difference between the two is illustrated graphically in Figure 5.25. f(t - a)H(t - a)
may be interpreted as representing the function f(t) delayed in time by a units. Thus, when
considering its Laplace transform $e^{-as}F(s)$, where F(s) denotes the Laplace transform of
f(t), the component $e^{-as}$ may be interpreted as a delay operator on the transform F(s),
indicating that the response of the system characterized by F(s) will be delayed in time
by a units. Since many practically important systems have some form of delay inherent
in their behaviour, it is clear that the result of this theorem is very useful.

Figure 5.25 Illustration of f(t - a)H(t - a): the three panels show f(t), f(t)H(t - a) and f(t - a)H(t - a).

Example 5.36

Determine the Laplace transform of the causal function f(t) defined by

$$ f(t) = \begin{cases} t & (0 \le t < b) \\ 0 & (t \ge b) \end{cases} $$

Solution

f(t) is illustrated graphically in Figure 5.26, and is seen to characterize a sawtooth pulse
of duration b. In terms of unit step functions,

$$ f(t) = tH(t) - tH(t - b) $$

Figure 5.26 Sawtooth pulse.

In order to apply the second shift theorem, each term must be rearranged to be of the
form f(t - a)H(t - a); that is, the time argument t - a of the function must be the same
as that of the associated step function. In this particular example this gives

$$ f(t) = tH(t) - (t - b)H(t - b) - bH(t - b) $$

Taking Laplace transforms,

$$ \mathcal{L}\{f(t)\} = \mathcal{L}\{tH(t)\} - \mathcal{L}\{(t - b)H(t - b)\} - b\mathcal{L}\{H(t - b)\} $$

which, on using Theorem 5.4, leads to

$$ \mathcal{L}\{f(t)\} = \frac{1}{s^2} - e^{-bs}\mathcal{L}\{t\} - b\frac{e^{-bs}}{s} = \frac{1}{s^2} - \frac{e^{-bs}}{s^2} - \frac{b}{s}e^{-bs} $$

giving

$$ \mathcal{L}\{f(t)\} = \frac{1}{s^2}\left(1 - e^{-bs}\right) - \frac{b}{s}e^{-bs} $$

It should be noted that this result could have been obtained without the use of the
second shift theorem, since, directly from the definition of the Laplace transform,

$$ \begin{aligned}
\mathcal{L}\{f(t)\} &= \int_0^\infty f(t)e^{-st}\,dt = \int_0^b te^{-st}\,dt + \int_b^\infty 0\,e^{-st}\,dt \\
 &= \left[-\frac{te^{-st}}{s}\right]_0^b + \int_0^b \frac{e^{-st}}{s}\,dt
  = \left[-\frac{te^{-st}}{s} - \frac{e^{-st}}{s^2}\right]_0^b \\
 &= \left(-\frac{be^{-sb}}{s} - \frac{e^{-sb}}{s^2}\right) - \left(-\frac{1}{s^2}\right)
  = \frac{1}{s^2}\left(1 - e^{-bs}\right) - \frac{b}{s}e^{-bs}
\end{aligned} $$

as before.

Example 5.37

Obtain the Laplace transform of the piecewise-continuous causal function

$$ f(t) = \begin{cases} 2t^2 & (0 \le t < 3) \\ t + 4 & (3 \le t < 5) \\ 9 & (t \ge 5) \end{cases} $$

considered in Example 5.32.

Solution

In Example 5.32 we saw that f(t) may be expressed in terms of unit step functions as

$$ f(t) = 2t^2H(t) - (2t^2 - t - 4)H(t - 3) - (t - 5)H(t - 5) $$

Before we can find $\mathcal{L}\{f(t)\}$, the function $2t^2 - t - 4$ must be expressed as a function of
t - 3. This may be readily achieved as follows. Let z = t - 3. Then

$$ 2t^2 - t - 4 = 2(z + 3)^2 - (z + 3) - 4 = 2z^2 + 11z + 11 = 2(t - 3)^2 + 11(t - 3) + 11 $$

Hence

$$ f(t) = 2t^2H(t) - [2(t - 3)^2 + 11(t - 3) + 11]H(t - 3) - (t - 5)H(t - 5) $$

Taking Laplace transforms,

$$ \mathcal{L}\{f(t)\} = 2\mathcal{L}\{t^2H(t)\} - \mathcal{L}\{[2(t - 3)^2 + 11(t - 3) + 11]H(t - 3)\} - \mathcal{L}\{(t - 5)H(t - 5)\} $$
which, on using Theorem 5.4, leads to

$$ \mathcal{L}\{f(t)\} = 2\,\frac{2}{s^3} - e^{-3s}\mathcal{L}\{2t^2 + 11t + 11\} - e^{-5s}\mathcal{L}\{t\}
 = \frac{4}{s^3} - e^{-3s}\left(\frac{4}{s^3} + \frac{11}{s^2} + \frac{11}{s}\right) - \frac{e^{-5s}}{s^2} $$

Again this result could have been obtained directly from the definition of the Laplace
transform, but in this case the required integration by parts is a little more tedious.

Having set up s and t as symbolic variables and specified H, H3 and H5 then the
MATLAB commands

laplace(2*t^2*H-(2*t^2-t-4)*H3-(t-5)*H5);
pretty(ans)

generate

ans= 4/s^3-11exp(-3s)/s-11exp(-3s)/s^2-4exp(-3s)/s^3-exp(-5s)/s^2

In MAPLE the commands

with(inttrans):
laplace(Heaviside(t)*2*t^2 - Heaviside(t-3)*(2*t^2-t-4)
- Heaviside(t-5)*(t-5),t,s);

return an equivalent expression.
5.5.4 Inversion using the second shift theorem

We have seen in Examples 5.34 and 5.35 that, to obtain the Laplace transforms of
piecewise-continuous functions, use of the second shift theorem could be avoided,
since it is possible to obtain such transforms directly from the definition of the Laplace
transform.

In practice, the importance of the theorem lies in determining inverse transforms,
since, as indicated earlier, delays are inherent in most practical systems and engineers
are interested in knowing how these influence the system response. Consequently, by
far the most useful form of the second shift theorem is

$$ \mathcal{L}^{-1}\{e^{-as}F(s)\} = f(t - a)H(t - a) \tag{5.45} $$

Comparing (5.45) with the result (5.12), namely

$$ \mathcal{L}^{-1}\{F(s)\} = f(t)H(t) $$

we see that

$$ \mathcal{L}^{-1}\{e^{-as}F(s)\} = [f(t)H(t)] \text{ with } t \text{ replaced by } t - a $$

indicating that the response f(t) has been delayed in time by a units. This is why the
theorem is sometimes called the delay theorem.

This is readily implemented in MATLAB using the command ilaplace.

Example 5.38

Determine $\mathcal{L}^{-1}\left\{\dfrac{4e^{-4s}}{s(s + 2)}\right\}$.

Solution

This may be written as $\mathcal{L}^{-1}\{e^{-4s}F(s)\}$, where

$$ F(s) = \frac{4}{s(s + 2)} $$

First we obtain the inverse transform f(t) of F(s). Resolving into partial fractions,

$$ F(s) = \frac{2}{s} - \frac{2}{s + 2} $$

which, on inversion, gives

$$ f(t) = 2 - 2e^{-2t} $$

a graph of which is shown in Figure 5.27(a). Then, using (5.45), we have

$$ \mathcal{L}^{-1}\left\{e^{-4s}\frac{4}{s(s + 2)}\right\} = \mathcal{L}^{-1}\{e^{-4s}F(s)\} = f(t - 4)H(t - 4)
 = \left(2 - 2e^{-2(t - 4)}\right)H(t - 4) $$

giving

$$ \mathcal{L}^{-1}\left\{\frac{4e^{-4s}}{s(s + 2)}\right\} = \begin{cases} 0 & (t < 4) \\ 2\left(1 - e^{-2(t - 4)}\right) & (t \ge 4) \end{cases} $$

which is plotted in Figure 5.27(b).

Using MATLAB confirm that the commands

ilaplace(4*exp(-4*s)/(s*(s+2)));
pretty(ans)

generate the answer

2H(t-4)(1-exp(-2t+8))

The same answer is obtained in MAPLE using the commands

with(inttrans):
invlaplace(4*exp(-4*s)/(s*(s+2)),s,t);
Figure 5.27 Inverse transforms of Example 5.38: (a) graph of f(t); (b) graph of f(t - 4)H(t - 4).

Example 5.39

Determine $\mathcal{L}^{-1}\left\{\dfrac{e^{-s\pi}(s + 3)}{s(s^2 + 1)}\right\}$.

Solution

This may be written as $\mathcal{L}^{-1}\{e^{-s\pi}F(s)\}$, where

$$ F(s) = \frac{s + 3}{s(s^2 + 1)} $$

Resolving into partial fractions,

$$ F(s) = \frac{3}{s} - \frac{3s}{s^2 + 1} + \frac{1}{s^2 + 1} $$

which, on inversion, gives

$$ f(t) = 3 - 3\cos t + \sin t $$

a graph of which is shown in Figure 5.28(a). Then, using (5.45), we have

$$ \begin{aligned}
\mathcal{L}^{-1}\left\{\frac{e^{-s\pi}(s + 3)}{s(s^2 + 1)}\right\} &= \mathcal{L}^{-1}\{e^{-s\pi}F(s)\} = f(t - \pi)H(t - \pi) \\
 &= [3 - 3\cos(t - \pi) + \sin(t - \pi)]H(t - \pi) \\
 &= (3 + 3\cos t - \sin t)H(t - \pi)
\end{aligned} $$
giving

$$ \mathcal{L}^{-1}\left\{\frac{e^{-s\pi}(s + 3)}{s(s^2 + 1)}\right\} = \begin{cases} 0 & (t < \pi) \\ 3 + 3\cos t - \sin t & (t \ge \pi) \end{cases} $$

which is plotted in Figure 5.28(b).

Figure 5.28 Inverse transforms of Example 5.39: (a) graph of f(t); (b) graph of $f(t - \pi)H(t - \pi)$.


5.5.5 Differential equations 


We now return to the solution of linear differential equations for which the forcing
function f(t) is piecewise-continuous, like that illustrated in Figure 5.20. One
approach to solving a differential equation having such a forcing function is to solve
it separately for each of the continuous components $f_1(t)$, $f_2(t)$, and so on, comprising
f(t), using the fact that in this equation all the derivatives, except the highest, must
remain continuous so that values at the point of discontinuity provide the initial
conditions for the next section. This approach is obviously rather tedious, and a much
more direct one is to make use of Heaviside step functions to specify f(t). Then the
method of solution follows that used in Section 5.3, and we shall simply illustrate it
by examples.
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Example 5.40 


Solution 


Method 1 


Obtain the solution x(t), t = 0, of the differential equation 


dx dx _ 
—>+5—=+ 6x = f(t) (5.46) 
dt dt 


where f(t) is the pulse function 


3 (0<1t<6) 


Ko = |2 к 


and subject to the initial conditions x(0) = 0 and x(0) = 2. 


To illustrate the advantage of using a step function formulation of the forcing function 


f(t), we shall first solve separately for each of the time ranges. 


For 0 x t « 6, (5.46) becomes 


2 
= +5 dr +6x=3 
dt dt 
with x(0) = 0 and x(0) = 2. 
Taking Laplace transforms gives 





(82+ 5s + 6)X(s) = sx(0) + (0) + 5х(0) + 2 =2+ 2 
S S 
That is, 
1 1 
Xs) = t3 -24 E _ l 
s(s+2)(s+3) 5 s+2 s+3 


which, on inversion, gives 
x(th=3the"-e" (0<t< 6) 


We now determine the values of x(6) and x(6) in order to provide the initial conditions 
for the next stage: 


х(6) =1+1е7°- е =a, x(6)=-e"+3e%=8 


For t ≥ 6 we make the change of independent variable T = t − 6, whence (5.46) becomes

$$\frac{d^2x}{dT^2} + 5\frac{dx}{dT} + 6x = 0$$

subject to x(T = 0) = α and $\dot{x}(T = 0) = \beta$.
Taking Laplace transforms gives

$$(s^2 + 5s + 6)X(s) = sx(T=0) + \dot{x}(T=0) + 5x(T=0) = \alpha s + 5\alpha + \beta$$

That is,

$$X(s) = \frac{\alpha s + 5\alpha + \beta}{(s+2)(s+3)} = \frac{\beta + 3\alpha}{s+2} - \frac{\beta + 2\alpha}{s+3}$$

which, on taking inverse transforms, gives

$$x(T) = (\beta + 3\alpha)e^{-2T} - (\beta + 2\alpha)e^{-3T}$$

Substituting the values of α and β and reverting to the independent variable t gives

$$x(t) = \left(\tfrac{3}{2} + \tfrac{1}{2}e^{-12}\right)e^{-2(t-6)} - \left(1 + e^{-18}\right)e^{-3(t-6)} \quad (t \ge 6)$$

That is,

$$x(t) = \left(\tfrac{1}{2}e^{-2t} - e^{-3t}\right) + \left(\tfrac{3}{2}e^{-2(t-6)} - e^{-3(t-6)}\right) \quad (t \ge 6)$$

Thus the solution of the differential equation is

$$x(t) = \begin{cases} \tfrac{1}{2} + \tfrac{1}{2}e^{-2t} - e^{-3t} & (0 \le t < 6) \\ \left(\tfrac{1}{2}e^{-2t} - e^{-3t}\right) + \left(\tfrac{3}{2}e^{-2(t-6)} - e^{-3(t-6)}\right) & (t \ge 6) \end{cases}$$

The forcing function f(t) and response x(t) are shown in Figures 5.29(a) and (b)
respectively.

Figure 5.29 Forcing function and response of Example 5.40.

Method 2  In terms of Heaviside step functions,
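$$f(t) = 3H(t) - 3H(t-6)$$

so that, using (5.43),

$$\mathcal{L}\{f(t)\} = \frac{3}{s} - \frac{3}{s}e^{-6s}$$

Taking Laplace transforms in (5.46) then gives

$$(s^2 + 5s + 6)X(s) = sx(0) + \dot{x}(0) + 5x(0) + \mathcal{L}\{f(t)\} = 2 + \frac{3}{s} - \frac{3}{s}e^{-6s}$$

That is,

$$X(s) = \frac{2s+3}{s(s+2)(s+3)} - e^{-6s}\frac{3}{s(s+2)(s+3)}$$

$$\phantom{X(s)} = \left(\frac{1/2}{s} + \frac{1/2}{s+2} - \frac{1}{s+3}\right) - e^{-6s}\left(\frac{1/2}{s} - \frac{3/2}{s+2} + \frac{1}{s+3}\right)$$

Taking inverse Laplace transforms and using the result (5.45) gives

$$x(t) = \left(\tfrac{1}{2} + \tfrac{1}{2}e^{-2t} - e^{-3t}\right) - \left(\tfrac{1}{2} - \tfrac{3}{2}e^{-2(t-6)} + e^{-3(t-6)}\right)H(t-6)$$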
which is the required solution. This corresponds to that obtained in Method 1, since,
using the definition of H(t − 6), it may be written as

$$x(t) = \begin{cases} \tfrac{1}{2} + \tfrac{1}{2}e^{-2t} - e^{-3t} & (0 \le t < 6) \\ \left(\tfrac{1}{2}e^{-2t} - e^{-3t}\right) + \left(\tfrac{3}{2}e^{-2(t-6)} - e^{-3(t-6)}\right) & (t \ge 6) \end{cases}$$

This approach is clearly less tedious, since the initial conditions at the discontinuities
are automatically taken account of in the solution.
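As an independent check on the Method 2 result, the following minimal Python sketch compares the closed-form response with a direct numerical integration of (5.46). It is only an illustration: the function names (`heaviside`, `forcing`, `x_closed`, `x_numeric`), the choice of a classical fourth-order Runge-Kutta scheme and the step size are assumptions made here, not part of the text.

```python
import math

def heaviside(t):
    """Unit step H(t): 0 for t < 0, 1 for t >= 0."""
    return 1.0 if t >= 0 else 0.0

def forcing(t):
    """Pulse f(t) = 3H(t) - 3H(t - 6) of Example 5.40."""
    return 3.0 * (heaviside(t) - heaviside(t - 6.0))

def x_closed(t):
    """Closed-form response written with the single step function H(t - 6)."""
    first = 0.5 + 0.5 * math.exp(-2.0 * t) - math.exp(-3.0 * t)
    if t < 6.0:
        return first
    tau = t - 6.0
    return first - (0.5 - 1.5 * math.exp(-2.0 * tau) + math.exp(-3.0 * tau))

def x_numeric(t_end, steps_per_unit=2000):
    """Integrate x'' + 5x' + 6x = f(t), x(0) = 0, x'(0) = 2 with classical RK4."""
    n = int(round(t_end * steps_per_unit))
    h = t_end / n
    x, v = 0.0, 2.0

    def accel(t, x, v):
        return forcing(t) - 5.0 * v - 6.0 * x

    for i in range(n):
        t = i * h
        k1x, k1v = v, accel(t, x, v)
        k2x, k2v = v + 0.5 * h * k1v, accel(t + 0.5 * h, x + 0.5 * h * k1x, v + 0.5 * h * k1v)
        k3x, k3v = v + 0.5 * h * k2v, accel(t + 0.5 * h, x + 0.5 * h * k2x, v + 0.5 * h * k2v)
        k4x, k4v = v + h * k3v, accel(t + h, x + h * k3x, v + h * k3v)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x

if __name__ == "__main__":
    for t in (3.0, 6.0, 8.0):
        print(t, x_closed(t), x_numeric(t))
```

The numerical integrator needs no special treatment at t = 6 beyond evaluating the forcing function, which mirrors the point made above: the step-function formulation carries the conditions across the discontinuity automatically.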


It seems that the standard dsolve command is unable to deal with differential
equations having such Heaviside functions as their forcing function. To resolve this
problem use can be made of the maple command in MATLAB, which lets us
access MAPLE commands directly. Confirm that the following commands produce
the correct solution:

    maple('de:=diff(x(t),t$2)+5*diff(x(t),t)+6*x(t)=3*Heaviside(t)-3*Heaviside(t-6);')

    ans=
    de := diff(x(t),'$'(t,2))+5*diff(x(t),t)+6*x(t) = 3*Heaviside(t)-3*Heaviside(t-6)

    maple('dsolve({de,x(0)=0,D(x)(0)=2},x(t),method=laplace);')

In MAPLE the answer may be obtained directly using the commands:

    with(inttrans):
    de:=diff(x(t),t$2)+5*diff(x(t),t)+6*x(t)=3*Heaviside(t)-3*Heaviside(t-6);
    dsolve({de,x(0)=0,D(x)(0)=2},x(t),method=laplace);


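Example 5.41  Determine the solution x(t), t ≥ 0, of the differential equation

$$\frac{d^2x}{dt^2} + 2\frac{dx}{dt} + 5x = f(t) \tag{5.47}$$

where

$$f(t) = \begin{cases} t & (0 \le t < \pi) \\ 0 & (t \ge \pi) \end{cases}$$

and subject to the initial conditions x(0) = 0 and $\dot{x}(0) = 3$.

Solution  Following the procedures of Example 5.36, we have

$$f(t) = tH(t) - tH(t-\pi) = tH(t) - (t-\pi)H(t-\pi) - \pi H(t-\pi)$$

so that, using Theorem 5.4,

$$\mathcal{L}\{f(t)\} = \frac{1}{s^2} - \frac{e^{-\pi s}}{s^2} - \frac{\pi e^{-\pi s}}{s} = \frac{1}{s^2} - \frac{e^{-\pi s}}{s^2}(1 + \pi s)$$

Taking Laplace transforms in (5.47) then gives

$$(s^2 + 2s + 5)X(s) = sx(0) + \dot{x}(0) + 2x(0) + \mathcal{L}\{f(t)\} = 3 + \frac{1}{s^2} - \frac{e^{-\pi s}}{s^2}(1 + \pi s)$$

using the given initial conditions.
Thus

$$X(s) = \frac{3s^2 + 1}{s^2(s^2 + 2s + 5)} - e^{-\pi s}\frac{1 + \pi s}{s^2(s^2 + 2s + 5)}$$

which, on resolving into partial fractions, leads to

$$X(s) = \frac{1}{25}\left[-\frac{2}{s} + \frac{5}{s^2} + \frac{2s + 74}{(s+1)^2 + 4}\right] - \frac{e^{-\pi s}}{25}\left[\frac{5\pi - 2}{s} + \frac{5}{s^2} - \frac{(5\pi - 2)s + (10\pi + 1)}{(s+1)^2 + 4}\right]$$

$$\phantom{X(s)} = \frac{1}{25}\left[-\frac{2}{s} + \frac{5}{s^2} + \frac{2(s+1) + 72}{(s+1)^2 + 4}\right] - \frac{e^{-\pi s}}{25}\left[\frac{5\pi - 2}{s} + \frac{5}{s^2} - \frac{(5\pi - 2)(s+1) + (5\pi + 3)}{(s+1)^2 + 4}\right]$$

Taking inverse Laplace transforms and using (5.45) gives the desired solution:

$$x(t) = \tfrac{1}{25}\left(-2 + 5t + 2e^{-t}\cos 2t + 36e^{-t}\sin 2t\right)$$
$$\qquad - \tfrac{1}{25}\left[(5\pi - 2) + 5(t - \pi) - (5\pi - 2)e^{-(t-\pi)}\cos 2(t-\pi) - \tfrac{1}{2}(5\pi + 3)e^{-(t-\pi)}\sin 2(t-\pi)\right]H(t-\pi)$$

That is,

$$x(t) = \tfrac{1}{25}\left[5t - 2 + 2e^{-t}(\cos 2t + 18\sin 2t)\right] - \tfrac{1}{25}\left\{5t - 2 - e^{\pi}e^{-t}\left[(5\pi - 2)\cos 2t + \tfrac{1}{2}(5\pi + 3)\sin 2t\right]\right\}H(t-\pi)$$

or, in alternative form,

$$x(t) = \begin{cases} \tfrac{1}{25}\left[5t - 2 + 2e^{-t}(\cos 2t + 18\sin 2t)\right] & (0 \le t < \pi) \\ \tfrac{1}{25}e^{-t}\left\{\left(2 + (5\pi - 2)e^{\pi}\right)\cos 2t + \left[36 + \tfrac{1}{2}(5\pi + 3)e^{\pi}\right]\sin 2t\right\} & (t \ge \pi) \end{cases}$$

5.5.6 Periodic functions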


We have already determined the Laplace transforms of periodic functions, such
as sin ωt and cos ωt, which are smooth (differentiable) continuous functions. In many
engineering applications, however, one frequently encounters periodic functions that
exhibit discontinuous behaviour. Examples of typical periodic functions of practical 
importance are shown in Figure 5.30. 

Such periodic functions may be represented as infinite series of terms involving step 
functions; once expressed in such a form, the result (5.43) may then be used to obtain 
their Laplace transforms. 
Figure 5.30 Typical practically important periodic functions: (a) square wave; (b) sawtooth wave; (c) repeated pulse wave; (d) half-wave rectifier.
Example 5.42 Obtain the Laplace transform of the square wave illustrated in Figure 5.30(a). 


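Solution  In terms of step functions, the square wave may be expressed in the form

$$f(t) = KH(t) - 2KH(t - \tfrac{1}{2}T) + 2KH(t - T) - 2KH(t - \tfrac{3}{2}T) + 2KH(t - 2T) + \ldots$$
$$\phantom{f(t)} = K\left[H(t) - 2H(t - \tfrac{1}{2}T) + 2H(t - T) - 2H(t - \tfrac{3}{2}T) + 2H(t - 2T) + \ldots\right]$$

Taking Laplace transforms and using the result (5.43) gives

$$\mathcal{L}\{f(t)\} = F(s) = K\left(\frac{1}{s} - \frac{2}{s}e^{-sT/2} + \frac{2}{s}e^{-sT} - \frac{2}{s}e^{-3sT/2} + \frac{2}{s}e^{-2sT} + \ldots\right)$$
$$\phantom{\mathcal{L}\{f(t)\}} = \frac{2K}{s}\left[1 - e^{-sT/2} + (e^{-sT/2})^2 - (e^{-sT/2})^3 + (e^{-sT/2})^4 - \ldots\right] - \frac{K}{s}$$

The series inside the square brackets is an infinite geometric progression with first
term 1 and common ratio $-e^{-sT/2}$, and therefore has sum $(1 + e^{-sT/2})^{-1}$. Thus,

$$F(s) = \frac{2K}{s}\,\frac{1}{1 + e^{-sT/2}} - \frac{K}{s} = \frac{K}{s}\,\frac{1 - e^{-sT/2}}{1 + e^{-sT/2}}$$

That is,

$$\mathcal{L}\{f(t)\} = F(s) = \frac{K}{s}\tanh\tfrac{1}{4}sT$$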

The approach used in Example 5.42 may be used to prove the following theorem, which 
provides an explicit expression for the Laplace transform of a periodic function. 


Theorem 5.5  If f(t), defined for all positive t, is a periodic function with period T, that is
f(t + nT) = f(t) for all integers n, then

$$\mathcal{L}\{f(t)\} = \frac{1}{1 - e^{-sT}}\int_0^T e^{-st}f(t)\,dt$$

Proof  If, as illustrated in Figure 5.31, the periodic function f(t) is piecewise-continuous over
an interval of length T, then its Laplace transform exists and can be expressed as a
series of integrals over successive periods; that is,

$$\mathcal{L}\{f(t)\} = \int_0^\infty f(t)e^{-st}\,dt$$
$$\phantom{\mathcal{L}\{f(t)\}} = \int_0^T f(t)e^{-st}\,dt + \int_T^{2T} f(t)e^{-st}\,dt + \int_{2T}^{3T} f(t)e^{-st}\,dt + \ldots + \int_{(n-1)T}^{nT} f(t)e^{-st}\,dt + \ldots$$

Figure 5.31 Periodic function having period T.

If in successive integrals we make the substitutions

$$t = \tau + nT \quad (n = 0, 1, 2, 3, \ldots)$$

then

$$\mathcal{L}\{f(t)\} = \sum_{n=0}^{\infty}\int_0^T f(\tau + nT)e^{-s(\tau + nT)}\,d\tau$$

Since f(t) is periodic with period T,

$$f(\tau + nT) = f(\tau) \quad (n = 0, 1, 2, 3, \ldots)$$

so that

$$\mathcal{L}\{f(t)\} = \sum_{n=0}^{\infty}\int_0^T f(\tau)e^{-s\tau}e^{-snT}\,d\tau = \left(\sum_{n=0}^{\infty}e^{-snT}\right)\int_0^T f(\tau)e^{-s\tau}\,d\tau$$

The series $\sum_{n=0}^{\infty} e^{-snT} = 1 + e^{-sT} + e^{-2sT} + e^{-3sT} + \ldots$ is an infinite geometric progression
with first term 1 and common ratio $e^{-sT}$. Its sum is given by $(1 - e^{-sT})^{-1}$, so that

$$\mathcal{L}\{f(t)\} = \frac{1}{1 - e^{-sT}}\int_0^T f(\tau)e^{-s\tau}\,d\tau$$

Since, within the integral, τ is a 'dummy' variable, it may be replaced by t to give the
desired result.

We note that, in terms of the Heaviside step function, Theorem 5.5 may be stated as
follows:

If f(t), defined for all positive t, is a periodic function with period T and

$$f_1(t) = f(t)\left[H(t) - H(t - T)\right]$$

then

$$\mathcal{L}\{f(t)\} = (1 - e^{-sT})^{-1}\mathcal{L}\{f_1(t)\}$$

This formulation follows since f(t) is periodic and $f_1(t) = 0$ for t > T. For the periodic
function f(t) shown in Figure 5.31 the corresponding function $f_1(t)$ is shown in
Figure 5.32. We shall see from the following examples that this formulation simplifies
the process of obtaining Laplace transforms of periodic functions.

Figure 5.32 Plot of periodic function within one period.

Example 5.43  Confirm the result obtained in Example 5.42 using Theorem 5.5.

Solution  For the square wave f(t) illustrated in Figure 5.30(a), f(t) is defined over the period
0 < t < T by

$$f(t) = \begin{cases} K & (0 < t < \tfrac{1}{2}T) \\ -K & (\tfrac{1}{2}T < t < T) \end{cases}$$

Hence we can write $f_1(t) = K[H(t) - 2H(t - \tfrac{1}{2}T) + H(t - T)]$, and thus

$$\mathcal{L}\{f_1(t)\} = K\left(\frac{1}{s} - \frac{2}{s}e^{-sT/2} + \frac{1}{s}e^{-sT}\right) = \frac{K}{s}(1 - e^{-sT/2})^2$$

Using the result of Theorem 5.5,

$$\mathcal{L}\{f(t)\} = \frac{K(1 - e^{-sT/2})^2}{s(1 - e^{-sT})} = \frac{K(1 - e^{-sT/2})^2}{s(1 - e^{-sT/2})(1 + e^{-sT/2})}$$
$$\phantom{\mathcal{L}\{f(t)\}} = \frac{K}{s}\,\frac{1 - e^{-sT/2}}{1 + e^{-sT/2}} = \frac{K}{s}\tanh\tfrac{1}{4}sT$$

confirming the result obtained in Example 5.42.
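The agreement between Theorem 5.5 and the closed form (K/s) tanh(sT/4) can also be checked numerically. The sketch below is illustrative only: the function names, the midpoint quadrature rule and the sample values K = 1, T = 2 and s = 1.5 are assumptions chosen here. It evaluates the one-period integral of Theorem 5.5 and, separately, the defining integral truncated at a large upper limit.

```python
import math

def square_wave(t, K=1.0, T=2.0):
    """Square wave of Figure 5.30(a): +K on the first half-period, -K on the second."""
    return K if (t % T) < T / 2.0 else -K

def laplace_integral(f, s, upper, n):
    """Midpoint-rule estimate of the integral of f(t) e^{-st} over 0 <= t <= upper."""
    h = upper / n
    return h * sum(f((i + 0.5) * h) * math.exp(-s * (i + 0.5) * h) for i in range(n))

def periodic_transform(f, s, T, n=20000):
    """Theorem 5.5: integral over one period divided by (1 - e^{-sT})."""
    return laplace_integral(f, s, T, n) / (1.0 - math.exp(-s * T))

def closed_form(s, K=1.0, T=2.0):
    """Result of Examples 5.42 and 5.43."""
    return (K / s) * math.tanh(s * T / 4.0)

if __name__ == "__main__":
    s = 1.5
    print(periodic_transform(square_wave, s, 2.0))        # one period, Theorem 5.5
    print(laplace_integral(square_wave, s, 40.0, 400000))  # truncated defining integral
    print(closed_form(s))
```

The one-period route needs far fewer function evaluations than the truncated integral, which is the practical advantage the theorem offers.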
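Example 5.44  Determine the Laplace transform of the rectified half-wave defined by

$$f(t) = \begin{cases} \sin\omega t & (0 < t < \pi/\omega) \\ 0 & (\pi/\omega < t < 2\pi/\omega) \end{cases}$$

$$f(t + 2n\pi/\omega) = f(t) \quad \text{for all integers } n$$

Solution  f(t) is illustrated in Figure 5.30(d), with T = 2π/ω. We can express $f_1(t)$ as

$$f_1(t) = \sin\omega t\left[H(t) - H(t - \pi/\omega)\right] = \sin\omega t\,H(t) + \sin\omega(t - \pi/\omega)\,H(t - \pi/\omega)$$

So

$$\mathcal{L}\{f_1(t)\} = \frac{\omega}{s^2 + \omega^2} + e^{-s\pi/\omega}\frac{\omega}{s^2 + \omega^2} = \frac{\omega}{s^2 + \omega^2}\left(1 + e^{-s\pi/\omega}\right)$$

Then, by the result of Theorem 5.5, and noting that $1 - e^{-2\pi s/\omega} = (1 - e^{-\pi s/\omega})(1 + e^{-\pi s/\omega})$,

$$\mathcal{L}\{f(t)\} = \frac{\omega}{s^2 + \omega^2}\,\frac{1 + e^{-\pi s/\omega}}{1 - e^{-2\pi s/\omega}} = \frac{\omega}{(s^2 + \omega^2)(1 - e^{-\pi s/\omega})}$$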


5.5.7 Exercises 


Check your answers using MATLAB or MAPLE whenever possible. 


13  A function f(t) is defined by

$$f(t) = \begin{cases} t & (0 \le t \le 1) \\ 0 & (t > 1) \end{cases}$$

Express f(t) in terms of Heaviside unit step functions and show that

$$\mathcal{L}\{f(t)\} = \frac{1}{s^2}(1 - e^{-s}) - \frac{1}{s}e^{-s}$$

14  Express in terms of Heaviside unit step functions the following piecewise-continuous causal functions. In each case obtain the Laplace transform of the function.

$$\text{(a)}\quad f(t) = \begin{cases} 3t^2 & (0 < t \le 4) \\ 2t - 3 & (4 < t < 6) \\ 5 & (t > 6) \end{cases}$$

$$\text{(b)}\quad g(t) = \begin{cases} t & (0 \le t < 1) \\ 2 - t & (1 < t < 2) \\ 0 & (t > 2) \end{cases}$$

15  Obtain the inverse Laplace transforms of the following:

$$\text{(a)}\ \frac{e^{-5s}}{(s-2)^4} \qquad \text{(b)}\ \frac{3e^{-2s}}{(s+3)(s+1)} \qquad \text{(c)}\ \frac{s+1}{s^2(s^2+1)}e^{-s}$$

$$\text{(d)}\ \frac{s+1}{s^2+s+1}e^{-\pi s} \qquad \text{(e)}\ \frac{s}{s^2+25}e^{-4\pi s/5} \qquad \text{(f)}\ \frac{e^{-s}(1 - e^{-s})}{s^2(s^2+1)}$$

16  Given that x = 0 when t = 0, obtain the solution of the differential equation

$$\frac{dx}{dt} + x = f(t) \quad (t \ge 0)$$

where f(t) is the function defined in Exercise 13. Sketch a graph of the solution.

17  Given that x = 1 and dx/dt = 0, obtain the solution of the differential equation

$$\frac{d^2x}{dt^2} + \frac{dx}{dt} + x = g(t) \quad (t \ge 0)$$

where g(t) is the piecewise-continuous function defined in Exercise 14(b).

18  Show that the function

$$f(t) = \begin{cases} 0 & (0 \le t < \tfrac{1}{2}\pi) \\ \sin t & (t \ge \tfrac{1}{2}\pi) \end{cases}$$

may be expressed in the form $f(t) = \cos(t - \tfrac{1}{2}\pi)\,H(t - \tfrac{1}{2}\pi)$, where H(t) is the Heaviside unit step function. Hence solve the differential equation

$$\frac{d^2x}{dt^2} + 3\frac{dx}{dt} + 2x = f(t)$$

where f(t) is given above, and x = 1 and dx/dt = −1 when t = 0.

19  Express the function

$$f(t) = \begin{cases} 3 & (0 \le t < 4) \\ 2t - 5 & (t \ge 4) \end{cases}$$

in terms of Heaviside unit step functions and obtain its Laplace transform. Obtain the response of the harmonic oscillator

$$\ddot{x} + x = f(t)$$

to such a forcing function, given that x = 1 and dx/dt = 0 when t = 0.

20  The response $\theta_o(t)$ of a system to a forcing function $\theta_i(t)$ is determined by the second-order differential equation

$$\ddot{\theta}_o + 6\dot{\theta}_o + 10\theta_o = \theta_i \quad (t \ge 0)$$

Suppose that $\theta_i(t)$ is a constant stimulus applied for a limited period and characterized by

$$\theta_i(t) = \begin{cases} 3 & (0 \le t < a) \\ 0 & (t \ge a) \end{cases}$$

Determine the response of the system at time t given that the system was initially in a quiescent state. Show that the transient response at time T (> a) is

$$-\tfrac{3}{10}e^{-3T}\left\{\cos T + 3\sin T - e^{3a}\left[\cos(T - a) + 3\sin(T - a)\right]\right\}$$

21  The input $\theta_i(t)$ and output $\theta_o(t)$ of a servomechanism are related by the differential equation

$$\ddot{\theta}_o + 8\dot{\theta}_o + 16\theta_o = \theta_i \quad (t \ge 0)$$

and initially $\theta_o(0) = \dot{\theta}_o(0) = 0$. For $\theta_i = f(t)$, where

$$f(t) = \begin{cases} 1 - t & (0 < t < 1) \\ 0 & (t > 1) \end{cases}$$

show that

$$\mathcal{L}\{\theta_i(t)\} = \frac{s - 1}{s^2} + \frac{1}{s^2}e^{-s}$$

and hence obtain an expression for the response of the system at time t.

22  During the time interval $t_1$ to $t_2$, a constant electromotive force $e_0$ acts on the series RC circuit shown in Figure 5.33. Assuming that the circuit is initially in a quiescent state, show that the current in the circuit at time t is

$$i(t) = \frac{e_0}{R}\left[e^{-(t - t_1)/RC}H(t - t_1) - e^{-(t - t_2)/RC}H(t - t_2)\right]$$

Sketch this as a function of time.

Figure 5.33 Circuit of Exercise 22.

23  A periodic function f(t), with period 4 units, is defined within the interval 0 ≤ t < 4 by

$$f(t) = \begin{cases} 3t & (0 \le t < 2) \\ 6 & (2 \le t < 4) \end{cases}$$

Sketch a graph of the function for 0 ≤ t ≤ 12 and obtain its Laplace transform.

24  Obtain the Laplace transform of the periodic sawtooth wave with period T, illustrated in Figure 5.30(b).


5.5.8 The impulse function

Suppose a hammer is used to strike a nail; then the hammer will be in contact with the
nail for a very short period of time, indeed almost instantaneously. A similar situation
arises when a golfer strikes a golf ball. In both cases the force applied, during this short
period of time, builds up rapidly to a large value and then rapidly decreases to zero.
Such short sharp forces are known as impulsive forces and are of interest in many
engineering applications. In practice it is not the duration of contact that is important
but the momentum transmitted, this being proportional to the time integral of the force
applied. Mathematically such forcing functions are represented by the impulse function.
To develop a mathematical formulation of the impulse function and obtain some insight
into its physical interpretation, consider the pulse function φ(t) defined by

$$\phi(t) = \begin{cases} 0 & (0 < t \le a - \tfrac{1}{2}T) \\ A/T & (a - \tfrac{1}{2}T < t < a + \tfrac{1}{2}T) \\ 0 & (t \ge a + \tfrac{1}{2}T) \end{cases}$$

and illustrated in Figure 5.34(a). Since the height of the pulse is A/T and its duration (or
width) is T, the area under the pulse is A; that is,

$$\int_{-\infty}^{\infty}\phi(t)\,dt = \int_{a - T/2}^{a + T/2}\frac{A}{T}\,dt = A$$

If we now consider the limiting process in which the duration of the pulse approaches
zero, in such a way that the area under the pulse remains A, then we obtain a formulation
of the impulse function of magnitude A occurring at time t = a. It is important to
appreciate that the magnitude of the impulse function is measured by its area.

The impulse function whose magnitude is unity is called the unit impulse function
or Dirac delta function (or simply delta function). The unit impulse occurring at
t = a is the limiting case of the pulse φ(t) of Figure 5.34(a) with A having the value
unity. It is denoted by δ(t − a) and has the properties

$$\delta(t - a) = 0 \quad (t \ne a)$$

$$\int_{-\infty}^{\infty}\delta(t - a)\,dt = 1$$

Likewise, an impulse function of magnitude A occurring at t = a is denoted by Aδ(t − a)
and may be represented diagrammatically as in Figure 5.34(b).

Figure 5.34 Impulse function.

An impulse function is not a function in the usual sense, but is an example of a class
of what are called generalized functions, which may be analysed using the theory of
generalized calculus. (It may also be regarded mathematically as a distribution and
investigated using the theory of distributions.) However, its properties are such that,
used with care, it can lead to results that have physical or practical significance and
which in many cases cannot be obtained by any other method. In this context it provides
engineers with an important mathematical tool. Although, clearly, an impulse function
is not physically realizable, it follows from the above formulation that physical signals
can be produced that closely approximate it.

We noted that the magnitude of the impulse function is determined by the area under
the limiting pulse. The actual shape of the limiting pulse is not really important, provided
that the area contained within it remains constant as its duration approaches zero.
Physically, therefore, the unit impulse function at t = a may equally well be regarded
as the pulse $\phi_1(t)$ of Figure 5.35 in the limiting case as T approaches zero.

Figure 5.35 Approximation to a unit pulse.

In some applications we need to consider a unit impulse function at time t = 0. This
is denoted by δ(t) and is defined as the limiting case of the pulse $\phi_2(t)$ illustrated in
Figure 5.36 as T approaches zero. It has the properties

$$\delta(t) = 0 \quad (t \ne 0)$$

$$\int_{-\infty}^{\infty}\delta(t)\,dt = 1$$

Figure 5.36 Pulse at the origin.

5.5.9 The sifting property


An important property of the unit impulse function that is of practical significance is
the so-called sifting property, which states that if f(t) is continuous at t = a then

$$\int_{-\infty}^{\infty} f(t)\delta(t - a)\,dt = f(a) \tag{5.48}$$


This is referred to as the sifting property because it provides a method of isolating, or 
sifting out, the value of a function at any particular point. 
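The sifting property can be illustrated numerically by replacing the delta function with a narrow rectangular pulse of unit area and letting its width shrink. The following is a minimal sketch under that assumption; the function name `sift_with_pulse`, the midpoint rule and the test function cos t at a = π/3 are choices made here for illustration.

```python
import math

def sift_with_pulse(f, a, T, n=2000):
    """Integrate f(t) against a unit-area rectangular pulse of width T centred at t = a.

    The pulse has height 1/T on (a - T/2, a + T/2) and is zero elsewhere, so only
    that interval contributes; the midpoint rule is applied on it.
    """
    h = T / n
    start = a - T / 2.0
    return sum(f(start + (i + 0.5) * h) for i in range(n)) * h / T

if __name__ == "__main__":
    a = math.pi / 3.0
    for T in (1.0, 0.1, 0.001):
        # As T shrinks, the value approaches f(a) = cos(pi/3)
        print(T, sift_with_pulse(math.cos, a, T))
```

As the pulse narrows, the integral simply averages f over a shrinking neighbourhood of a, which is why continuity of f at t = a is required.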

For theoretical reasons it is convenient to use infinite limits in (5.48), while in reality
finite limits can be substituted. This follows since for α < a < β, where α and β are
constants,

$$\int_{\alpha}^{\beta} f(t)\delta(t - a)\,dt = f(a) \tag{5.49}$$

For example,

$$\int_0^{2\pi}\cos t\,\delta(t - \tfrac{1}{3}\pi)\,dt = \cos\tfrac{1}{3}\pi = \tfrac{1}{2}$$

5.5.10 Laplace transforms of impulse functions


By the definition of the Laplace transform, we have for any a > 0

$$\mathcal{L}\{\delta(t - a)\} = \int_0^{\infty}\delta(t - a)e^{-st}\,dt$$

which, using the sifting property, gives the important result

$$\mathcal{L}\{\delta(t - a)\} = e^{-as} \tag{5.50}$$

or, in terms of the inverse transform,

$$\mathcal{L}^{-1}\{e^{-as}\} = \delta(t - a) \tag{5.51}$$

As mentioned earlier, in many applications we may have an impulse function δ(t) at
t = 0, and it is in order to handle such a function that we must carefully specify whether
the lower limit in the Laplace integral defined in Section 5.2.1 is $0^-$ or $0^+$. Adopting the
notation

$$\mathcal{L}_+\{f(t)\} = \int_{0^+}^{\infty} f(t)e^{-st}\,dt$$

$$\mathcal{L}_-\{f(t)\} = \int_{0^-}^{\infty} f(t)e^{-st}\,dt$$

we have

$$\mathcal{L}_-\{f(t)\} = \int_{0^-}^{0^+} f(t)e^{-st}\,dt + \int_{0^+}^{\infty} f(t)e^{-st}\,dt$$

If f(t) does not involve an impulse function at t = 0 then clearly $\mathcal{L}_+\{f(t)\} = \mathcal{L}_-\{f(t)\}$.
However, if f(t) does involve an impulse function at t = 0 then

$$\int_{0^-}^{0^+} f(t)\,dt \ne 0$$

and it follows that

$$\mathcal{L}_+\{f(t)\} \ne \mathcal{L}_-\{f(t)\}$$

In Section 5.2.1 we adopted the definition

$$\mathcal{L}\{f(t)\} = \mathcal{L}_-\{f(t)\}$$

so that (5.50) and (5.51) hold for a = 0, giving

$$\mathcal{L}\{\delta(t)\} = \int_{0^-}^{\infty}\delta(t)e^{-st}\,dt = e^{-s0} = 1$$

so that

$$\mathcal{L}\{\delta(t)\} = 1$$

or, in inverse form,

$$\mathcal{L}^{-1}\{1\} = \delta(t)$$

This transform can be implemented in MATLAB using the sequence of commands

    syms s t
    del=sym('Dirac(t)');
    laplace(del)

Likewise for (5.50); for example, if a = 2 then the Laplace transform of δ(t − 2) is
generated by the commands

    del2=sym('Dirac(t-2)');
    laplace(del2)

or directly using the command

    laplace(sym('Dirac(t-2)'))

giving the answer exp(-2*s) in each case.

In MAPLE the commands

    with(inttrans):
    laplace(Dirac(t-2),t,s);

return the answer $e^{-2s}$.

Example 5.45  Determine $\mathcal{L}^{-1}\left\{\dfrac{s^2}{s^2 + 4}\right\}$.

Solution  Since

$$\frac{s^2}{s^2 + 4} = \frac{s^2 + 4 - 4}{s^2 + 4} = 1 - \frac{4}{s^2 + 4}$$

we have

$$\mathcal{L}^{-1}\left\{\frac{s^2}{s^2 + 4}\right\} = \mathcal{L}^{-1}\{1\} - \mathcal{L}^{-1}\left\{\frac{4}{s^2 + 4}\right\}$$

giving

$$\mathcal{L}^{-1}\left\{\frac{s^2}{s^2 + 4}\right\} = \delta(t) - 2\sin 2t$$

In MATLAB this is obtained directly, with the commands

    ilaplace(s^2/(s^2+4));
    pretty(ans)

generating the answer

    Dirac(t)-2sin2t

The answers may also be obtained in MAPLE using the commands

    with(inttrans):
    invlaplace(s^2/(s^2+4),s,t);

Example 5.46  Determine the solution of the differential equation

$$\frac{d^2x}{dt^2} + 3\frac{dx}{dt} + 2x = 1 + \delta(t - 4) \tag{5.54}$$

subject to the initial conditions $x(0) = \dot{x}(0) = 0$.


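Solution  Taking Laplace transforms in (5.54) gives

$$[s^2X(s) - sx(0) - \dot{x}(0)] + 3[sX(s) - x(0)] + 2X(s) = \mathcal{L}\{1\} + \mathcal{L}\{\delta(t - 4)\}$$

which, on incorporating the given initial conditions and using (5.50), leads to

$$(s^2 + 3s + 2)X(s) = \frac{1}{s} + e^{-4s}$$

giving

$$X(s) = \frac{1}{s(s+1)(s+2)} + e^{-4s}\frac{1}{(s+1)(s+2)}$$

Resolving into partial fractions, we have

$$X(s) = \frac{1}{2}\left(\frac{1}{s} + \frac{1}{s+2} - \frac{2}{s+1}\right) + e^{-4s}\left(\frac{1}{s+1} - \frac{1}{s+2}\right)$$

which, on taking inverse transforms and using the result (5.45), gives the required
response:

$$x(t) = \tfrac{1}{2}\left(1 + e^{-2t} - 2e^{-t}\right) + \left(e^{-(t-4)} - e^{-2(t-4)}\right)H(t - 4)$$

or, in an alternative form,

$$x(t) = \begin{cases} \tfrac{1}{2}\left(1 + e^{-2t} - 2e^{-t}\right) & (0 \le t < 4) \\ \tfrac{1}{2} + (e^4 - 1)e^{-t} - (e^8 - \tfrac{1}{2})e^{-2t} & (t \ge 4) \end{cases}$$

We note that, although the response x(t) is continuous at t = 4, the consequence of the
impulsive input at t = 4 is a step change in the derivative $\dot{x}(t)$.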
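The step change in the derivative can be seen directly from the closed-form response of Example 5.46. The short Python sketch below, whose function names and finite-difference step are assumptions made for illustration, estimates the one-sided slopes of x(t) at t = 4; their difference should be close to 1, the magnitude of the impulse, while x(t) itself stays continuous.

```python
import math

def x(t):
    """Response of Example 5.46: unit step input plus a unit impulse at t = 4."""
    value = 0.5 * (1.0 + math.exp(-2.0 * t) - 2.0 * math.exp(-t))
    if t >= 4.0:
        tau = t - 4.0
        value += math.exp(-tau) - math.exp(-2.0 * tau)
    return value

def one_sided_slopes(t0, h=1e-6):
    """Backward and forward difference estimates of dx/dt at t0."""
    left = (x(t0) - x(t0 - h)) / h
    right = (x(t0 + h) - x(t0)) / h
    return left, right

if __name__ == "__main__":
    left, right = one_sided_slopes(4.0)
    print("slope just before t = 4:", left)
    print("slope just after  t = 4:", right)
    print("jump in slope:", right - left)
```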
As was the case in Example 5.40, when considering Heaviside functions as forcing
terms, it seems that the dsolve command in MATLAB cannot be used directly in
this case. Using the maple command, the following commands:

    maple('de:=diff(x(t),t$2)+3*diff(x(t),t)+2*x(t)=1+Dirac(t-4);')

    ans=
    de := diff(x(t),'$'(t,2))+3*diff(x(t),t)+2*x(t) = 1+Dirac(t-4)

    maple('dsolve({de,x(0)=0,D(x)(0)=0},x(t),method=laplace);')

output the required answer:

    x(t)=1/2-exp(-t)+1/2*exp(-2*t)-Heaviside(t-4)*exp(-2*t+8)+Heaviside(t-4)*exp(-t+4)

5.5.11 Relationship between Heaviside step and impulse functions


From the definitions of H(t) and δ(t), it can be argued that

$$H(t) = \int_{-\infty}^{t}\delta(\tau)\,d\tau \tag{5.55}$$

since the interval of integration contains zero if t > 0 but not if t < 0. Conversely,
(5.55) may be written as

$$\delta(t) = \frac{d}{dt}H(t) = H'(t) \tag{5.56}$$

which expresses the fact that H′(t) is zero everywhere except at t = 0, when the jump
in H(t) occurs.

While this argument may suffice in practice, since we are dealing with generalized
functions a more formal proof requires the development of some properties of generalized
functions. In particular, we need to define what is meant by saying that two
generalized functions are equivalent.

One method of approach is to use the concept of a test function θ(t), which is a
continuous function that has continuous derivatives of all orders and that is zero outside
a finite interval. One class of testing function, adopted by R. R. Gabel and R. A. Roberts
(Signals and Linear Systems, Wiley, New York, 1973), is

$$\theta(t) = \begin{cases} e^{-d^2/(d^2 - t^2)} & (|t| < d), \text{ where } d = \text{constant} \\ 0 & \text{otherwise} \end{cases}$$

For a generalized function g(t) the integral

$$G(\theta) = \int_{-\infty}^{\infty}\theta(t)g(t)\,dt$$

is evaluated. This integral assigns the number G(θ) to each function θ(t), so that G(θ)
is a generalization of the concept of a function: it is a linear functional on the space of
test functions θ(t). For example, if g(t) = δ(t) then

$$G(\theta) = \int_{-\infty}^{\infty}\theta(t)\delta(t)\,dt = \theta(0)$$

so that in this particular case, for each weighting function θ(t), the value θ(0) is
assigned to G(θ).

We can now use the concept of a test function to define what is meant by saying that 
two generalized functions are equivalent or ‘equal’. 


Definition 5.2: The equivalence property 


If $g_1(t)$ and $g_2(t)$ are two generalized functions then $g_1(t) = g_2(t)$ if and only if

$$\int_{-\infty}^{\infty}\theta(t)g_1(t)\,dt = \int_{-\infty}^{\infty}\theta(t)g_2(t)\,dt$$

for all test functions θ(t) for which the integrals exist.

The test function may be regarded as a 'device' for examining the generalized function.
Gabel and Roberts draw a rough parallel with the role of using the output of a
measuring instrument to deduce properties about what is being measured. In such an
analogy $g_1(t) = g_2(t)$ if the measuring instrument can detect no differences between
them.

Using the concept of a test function θ(t), the Dirac delta function δ(t) may be
defined in the generalized form

$$\int_{-\infty}^{\infty}\theta(t)\delta(t)\,dt = \theta(0)$$

Interpreted as an ordinary integral, this has no meaning. The integral and the function
δ(t) are merely defined by the number θ(0). In this sense we can handle δ(t) as if it
were an ordinary function, except that we never talk about the value of δ(t); rather we
talk about the value of integrals involving δ(t).
Using the equivalence property, we can now confirm the result (5.56), namely that

$$\delta(t) = \frac{d}{dt}H(t) = H'(t)$$

To prove this, we must show that

$$\int_{-\infty}^{\infty}\theta(t)\delta(t)\,dt = \int_{-\infty}^{\infty}\theta(t)H'(t)\,dt \tag{5.57}$$

Integrating the right-hand side of (5.57) by parts, we have

$$\int_{-\infty}^{\infty}\theta(t)H'(t)\,dt = \left[H(t)\theta(t)\right]_{-\infty}^{\infty} - \int_{-\infty}^{\infty}H(t)\theta'(t)\,dt$$
$$\phantom{\int_{-\infty}^{\infty}\theta(t)H'(t)\,dt} = 0 - \int_0^{\infty}\theta'(t)\,dt \quad \text{(by the definitions of } \theta(t) \text{ and } H(t)\text{)}$$
$$\phantom{\int_{-\infty}^{\infty}\theta(t)H'(t)\,dt} = -\left[\theta(t)\right]_0^{\infty} = \theta(0)$$

Since the left-hand side of (5.57) is also θ(0), the equivalence of δ(t) and H′(t) is proved.
Likewise, it can be shown that

$$\delta(t - a) = \frac{d}{dt}H(t - a) = H'(t - a) \tag{5.58}$$

The results (5.56) and (5.58) may be used to obtain the generalized derivatives of
piecewise-continuous functions having jump discontinuities $d_1, d_2, \ldots, d_n$ at times
$t_1, t_2, \ldots, t_n$ respectively, as illustrated in Figure 5.37. On expressing f(t) in terms of
Heaviside step functions as in Section 5.5.1, and differentiating using the product rule,
use of (5.56) and (5.58) leads to the result

$$f'(t) = g'(t) + \sum_{i=1}^{n} d_i\,\delta(t - t_i) \tag{5.59}$$

where g′(t) denotes the ordinary derivative of f(t) where it exists. The result (5.59) tells
us that the derivative of a piecewise-continuous function with jump discontinuities
is the ordinary derivative where it exists plus the sum of delta functions at the
discontinuities multiplied by the magnitudes of the respective jumps.

Figure 5.37 Piecewise-continuous function with jump discontinuities.

By the magnitude $d_i$ of a jump in a function f(t) at a point $t_i$, we mean the difference
between the right-hand and left-hand limits of f(t) at $t_i$; that is,

$$d_i = f(t_i + 0) - f(t_i - 0)$$

It follows that an upward jump, such as $d_1$ and $d_2$ in Figure 5.37, is positive, while a
downward jump, such as $d_3$ in Figure 5.37, is negative.

The result (5.59) gives an indication as to why the use of differentiators in practical 
systems is not encouraged, since the introduction of impulses means that derivatives 
increase noise levels in signal reception. In contrast, integrators have a smoothing effect 
on signals, and are widely used. 
Example 5.47  Obtain the generalized derivative of the piecewise-continuous function

$$f(t) = \begin{cases} 2t^2 + 1 & (0 \le t < 3) \\ t + 4 & (3 \le t < 5) \\ 4 & (t \ge 5) \end{cases}$$

Figure 5.38 Piecewise-continuous function of Example 5.47.

Solution  f(t) is depicted graphically in Figure 5.38, and it has jump discontinuities of
magnitudes 1, −12 and −5 at times t = 0, 3 and 5 respectively. Using (5.59), the generalized
derivative is

$$f'(t) = g'(t) + 1\,\delta(t) - 12\,\delta(t - 3) - 5\,\delta(t - 5)$$

where

$$g'(t) = \begin{cases} 4t & (0 \le t < 3) \\ 1 & (3 \le t < 5) \\ 0 & (t \ge 5) \end{cases}$$

Example 5.48  A system is characterized by the differential equation model

$$\frac{d^2x}{dt^2} + 5\frac{dx}{dt} + 6x = u + 3\frac{du}{dt} \tag{5.60}$$

Determine the response of the system to a forcing function $u(t) = e^{-t}$ applied at time
t = 0, given that it was initially in a quiescent state.

Solution  Since the system is initially in a quiescent state, the transformed equation
corresponding to (5.60) is

$$(s^2 + 5s + 6)X(s) = (3s + 1)U(s)$$

giving

$$X(s) = \frac{3s + 1}{s^2 + 5s + 6}U(s)$$

In the particular case when $u(t) = e^{-t}$, U(s) = 1/(s + 1), so that

$$X(s) = \frac{3s + 1}{(s+1)(s+2)(s+3)} = \frac{-1}{s+1} + \frac{5}{s+2} - \frac{4}{s+3}$$

which, on taking inverse transforms, gives the desired response as

$$x(t) = -e^{-t} + 5e^{-2t} - 4e^{-3t} \quad (t \ge 0)$$

One might have been tempted to adopt a different approach and substitute for u(t)
directly in (5.60) before taking Laplace transforms. This leads to

$$\frac{d^2x}{dt^2} + 5\frac{dx}{dt} + 6x = e^{-t} - 3e^{-t} = -2e^{-t}$$

which, on taking Laplace transforms, leads to

$$(s^2 + 5s + 6)X(s) = -\frac{2}{s+1}$$

giving

$$X(s) = \frac{-2}{(s+1)(s+2)(s+3)} = \frac{-1}{s+1} + \frac{2}{s+2} - \frac{1}{s+3}$$

which, on inversion, gives

$$x(t) = -e^{-t} + 2e^{-2t} - e^{-3t} \quad (t \ge 0)$$

Clearly this approach results in a different solution, and therefore appears to lead to a
paradox. However, this apparent paradox can be resolved by noting that the second
approach is erroneous in that it ignores the important fact that we are dealing with
causal functions. Strictly speaking,

$$u(t) = e^{-t}H(t)$$

and, when determining du/dt, the product rule of differential calculus should be
employed, giving

$$\frac{du}{dt} = -e^{-t}H(t) + e^{-t}\frac{d}{dt}H(t) = -e^{-t}H(t) + e^{-t}\delta(t)$$

Substituting this into (5.60) and taking Laplace transforms gives

$$(s^2 + 5s + 6)X(s) = \frac{1}{s+1} + 3\left(-\frac{1}{s+1} + 1\right) = \frac{3s + 1}{s+1}$$

That is,

$$X(s) = \frac{3s + 1}{(s+1)(s^2 + 5s + 6)}$$

leading to the same response

$$x(t) = -e^{-t} + 5e^{-2t} - 4e^{-3t} \quad (t \ge 0)$$

as in the first approach above.
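It may help to see exactly where the two candidate responses of Example 5.48 differ. The sketch below is an illustration only, with the representation of each response as a list of (coefficient, decay rate) pairs and the function names being assumptions made here. It confirms that both expressions satisfy the differential equation for t > 0, where the right-hand side reduces to $-2e^{-t}$, but that they have different initial slopes: 3 for the correct response and 0 for the erroneous one. The difference of 3 is the jump in dx/dt produced by the term 3δ(t) that the second approach overlooks.

```python
import math

def evaluate(terms, t, order=0):
    """Value at t of the order-th derivative of sum(c * exp(-r * t)) over (c, r) in terms."""
    return sum(c * (-r) ** order * math.exp(-r * t) for c, r in terms)

correct = [(-1.0, 1.0), (5.0, 2.0), (-4.0, 3.0)]   # -e^{-t} + 5e^{-2t} - 4e^{-3t}
naive = [(-1.0, 1.0), (2.0, 2.0), (-1.0, 3.0)]     # -e^{-t} + 2e^{-2t} - e^{-3t}

def residual(terms, t):
    """x'' + 5x' + 6x minus the right-hand side -2e^{-t}, valid for t > 0."""
    lhs = evaluate(terms, t, 2) + 5.0 * evaluate(terms, t, 1) + 6.0 * evaluate(terms, t)
    return lhs - (-2.0 * math.exp(-t))

if __name__ == "__main__":
    for name, terms in (("correct", correct), ("naive", naive)):
        print(name,
              "residual at t = 0.7:", residual(terms, 0.7),
              "x(0+):", evaluate(terms, 0.0),
              "dx/dt(0+):", evaluate(terms, 0.0, 1))
```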


The differential equation used in Example 5.48 is of a form that occurs frequently in
practice, so it is important that the causal nature of the forcing term be recognized.

The derivative δ′(t) of the impulse function is also a generalized function, and, using
the equivalence property, it is readily shown that

$$\int_{-\infty}^{\infty} f(t)\delta'(t)\,dt = -f'(0)$$

or, more generally,

$$\int_{-\infty}^{\infty} f(t)\delta'(t - a)\,dt = -f'(a)$$

provided that f′(t) is continuous at t = a.

Likewise, the nth derivative satisfies

$$\int_{-\infty}^{\infty} f(t)\delta^{(n)}(t - a)\,dt = (-1)^n f^{(n)}(a)$$

provided that $f^{(n)}(t)$ is continuous at t = a.
Using the definition of the Laplace transform, it follows that

$$\mathcal{L}\{\delta^{(n)}(t - a)\} = s^n e^{-as}$$

and, in particular,

$$\mathcal{L}\{\delta^{(n)}(t)\} = s^n \tag{5.61}$$

5.5.12 Exercises


Check your answers using MATLAB or MAPLE whenever possible.

25  Obtain the inverse Laplace transforms of the following:

$$\text{(a)}\ \frac{2s^2 + 1}{(s+2)(s+3)} \qquad \text{(b)}\ \frac{s^2 - 1}{s^2 + 4} \qquad \text{(c)}\ \frac{s^2 + 2}{s^2 + 2s + 5}$$

26  Solve for t ≥ 0 the following differential equations, subject to the specified initial conditions:

$$\text{(a)}\quad \frac{d^2x}{dt^2} + 7\frac{dx}{dt} + 12x = 2 + \delta(t - 2)$$

subject to x = 0 and dx/dt = 0 at t = 0

$$\text{(b)}\quad \frac{d^2x}{dt^2} + 6\frac{dx}{dt} + 13x = \delta(t - 2\pi)$$

subject to x = 0 and dx/dt = 0 at t = 0

$$\text{(c)}\quad \frac{d^2x}{dt^2} + 7\frac{dx}{dt} + 12x = \delta(t - 3)$$

subject to x = 1 and dx/dt = 1 at t = 0

27  Obtain the generalized derivatives of the following piecewise-continuous functions:

$$\text{(a)}\quad f(t) = \begin{cases} 3t^2 & (0 \le t < 4) \\ 2t - 3 & (4 \le t < 6) \\ 5 & (t \ge 6) \end{cases}$$

$$\text{(b)}\quad g(t) = \begin{cases} t & (0 \le t < 1) \\ 2 - t & (1 \le t < 2) \\ 0 & (t \ge 2) \end{cases}$$

$$\text{(c)}\quad f(t) = \begin{cases} 2t + 5 & (0 \le t < 2) \\ 9 - 3t & (2 \le t < 4) \\ t^2 - t & (t \ge 4) \end{cases}$$

28  Solve for t ≥ 0 the differential equation

$$\frac{d^2x}{dt^2} + 7\frac{dx}{dt} + 10x = 2u + 3\frac{du}{dt}$$

subject to x = 0 and dx/dt = 2 at t = 0 and where $u(t) = e^{-2t}H(t)$.

29  A periodic function f(t) is an infinite train of unit impulses at t = 0 and repeated at intervals of t = T. Show that

$$\mathcal{L}\{f(t)\} = \frac{1}{1 - e^{-sT}}$$

The response of a harmonic oscillator to such a periodic stimulus is determined by the differential equation

$$\frac{d^2x}{dt^2} + \omega^2 x = f(t) \quad (t \ge 0)$$

Show that

$$x(t) = \frac{1}{\omega}\sum_{n=0}^{\infty}H(t - nT)\sin\omega(t - nT) \quad (t \ge 0)$$

and sketch the responses from t = 0 to t = 6π/ω for the two cases (a) T = π/ω and (b) T = 2π/ω.

30  An impulse voltage Eδ(t) is applied at time t = 0 to a circuit consisting of a resistor R, a capacitor C and an inductor L connected in series. Prior to application of this voltage, both the charge on the capacitor and the resulting current in the circuit are zero. Determine the charge q(t) on the capacitor and the resulting current i(t) in the circuit at time t.

5.5.13 Bending of beams

So far, we have considered examples in which Laplace transform methods have been
used to solve initial-value-type problems. These methods may also be used to solve
boundary-value problems, and, to illustrate, we consider in this section the application
of Laplace transform methods to determine the transverse deflection of a uniform thin
beam due to loading.

Consider a thin uniform beam of length l and let y(x) be its transverse displacement,
at distance x measured from one end, from the original position due to loading. The
situation is illustrated in Figure 5.39, with the displacement measured upwards. Then,
from the elementary theory of beams, we have

$$EI\frac{d^4y}{dx^4} = -W(x) \tag{5.62}$$

where W(x) is the transverse force per unit length, with a downwards force taken to be
positive, and EI is the flexural rigidity of the beam (E is Young's modulus of elasticity
and I is the moment of inertia of the beam about its central axis). It is assumed that the
beam has uniform elastic properties and a uniform cross-section over its length, so that
both E and I are taken to be constants.

Figure 5.39 Transverse deflection of a beam: (a) initial position; (b) displaced position.

Equation (5.62) is sometimes written as

$$EI\frac{d^4y}{dx^4} = W(x)$$

where y(x) is the transverse displacement measured downwards and not upwards as
in (5.62).

In cases when the loading is uniform along the full length of the beam, that is
W(x) = constant, (5.62) may be readily solved by the normal techniques of integral
calculus. However, when the loading is non-uniform, the use of Laplace transform
methods has a distinct advantage, since by making use of Heaviside unit functions and
impulse functions, the problem of solving (5.62) independently for various sections of
the beam may be avoided.

Taking Laplace transforms throughout in (5.62) gives 


EI [s*Y(s) — s*y(0) — s^y,(0) — sy4(0) — y4(0)] = —W(s) (5.63) 
where 
_ (dy _ (Фу (Фу 
»(0) s y2(0) E 0) =( |, 


and may be interpreted physically as follows: 


EI y_3(0) is the shear at x = 0 
EI y_2(0) is the bending moment at x = 0 
y_1(0) is the slope at x = 0 
y(0) is the deflection at x = 0 


Solving (5.63) for Y(s) leads to 

\[ Y(s) = -\frac{W(s)}{EI\,s^4} + \frac{y(0)}{s} + \frac{y_1(0)}{s^2} + \frac{y_2(0)}{s^3} + \frac{y_3(0)}{s^4} \tag{5.64} \]

Thus four boundary conditions need to be found, and ideally they should be the shear, 
bending moment, slope and deflection at x = 0. However, in practice these boundary 
conditions are not often available. While some of them are known, other boundary 
conditions are specified at points along the beam other than at x = 0, for example conditions 
at the far end, x = l, or conditions at possible points of support along the beam. That is, 
we are faced with a boundary-value problem rather than an initial-value problem. 

To proceed, known conditions at x = 0 are inserted, while the other conditions among 
y(0), y_1(0), y_2(0) and y_3(0) that are not specified are carried forward as undetermined 
constants. Inverse transforms are taken throughout in (5.64) to obtain the deflection 
y(x), and the outstanding undetermined constants are obtained using the boundary 
conditions specified at points along the beam other than at x = 0. 

The boundary conditions are usually embodied in physical conditions such as the 
following: 


(a) The beam is freely, or simply, supported at both ends, indicating that both the 
bending moments and deflection are zero at both ends, so that y = d^2y/dx^2 = 0 at 
both x = 0 and x = l. 

(b) At both ends the beam is clamped, or built into a wall. Thus the beam is horizontal 
at both ends, so that y = dy/dx = 0 at both x = 0 and x = l. 

(c) The beam is a cantilever with one end free (that is, fixed horizontally at one end, 
with the other end free). At the fixed end (say x = 0) 

\[ y = \frac{dy}{dx} = 0 \quad \text{at } x = 0 \]

and at the free end (x = l), since both the shearing force and bending moment are zero, 

\[ \frac{d^2 y}{dx^2} = \frac{d^3 y}{dx^3} = 0 \quad \text{at } x = l \]

If the load is not uniform along the full length of the beam, use is made of Heaviside 
step functions and impulse functions in specifying W(x) in (5.62). For example, a 
uniform load w per unit length over the portion of the beam x = x_1 to x = x_2 is specified 
as wH(x - x_1) - wH(x - x_2), and a point load w at x = x_1 is specified as wδ(x - x_1). 

Example 5.49 

Figure 5.40 illustrates a uniform beam of length l, freely supported at both ends, bending 
under uniformly distributed self-weight W and a concentrated point load P at x = l/3. 
Determine the transverse deflection y(x) of the beam. 

[Figure 5.40: Loaded beam of Example 5.49.] 

Solution 

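As in Figure 5.39, the origin is taken at the left-hand end of the beam, and the deflection 
y(x) measured upwards from the horizontal at the level of the supports. The deflection 
y(x) is then given by (5.62), with the force function W(x) having contributions from the 
weight W, the concentrated load P and the support reactions R_1 and R_2. However, since 
we are interested in solving (5.62) for 0 < x < l, point loads or reactions at the end 
x = l may be omitted from the force function. 

As a preliminary, we need to determine R_1. This is done by taking static moments 
about the end x = l, assuming the weight W to be concentrated at the centroid x = l/2, 
giving 

\[ R_1 l = \tfrac{1}{2} W l + \tfrac{2}{3} P l \]

or 

\[ R_1 = \tfrac{1}{2} W + \tfrac{2}{3} P \]

The force function W(x) may then be expressed as 

\[ W(x) = \frac{W}{l} H(x) + P\,\delta\!\left(x - \tfrac{1}{3} l\right) - \left(\tfrac{1}{2} W + \tfrac{2}{3} P\right)\delta(x) \]

with a Laplace transform 

\[ W(s) = \frac{W}{ls} + P e^{-ls/3} - \left(\tfrac{1}{2} W + \tfrac{2}{3} P\right) \]

Since the beam is freely supported at both ends, the deflection and bending moments 
are zero at both ends, so we take the boundary conditions as 

\[ y = 0 \quad \text{at } x = 0 \text{ and } x = l \]

\[ \frac{d^2 y}{dx^2} = 0 \quad \text{at } x = 0 \text{ and } x = l \]

The transformed equation (5.64) becomes 

\[ Y(s) = -\frac{1}{EI}\left[\frac{W}{l s^5} + \frac{P}{s^4} e^{-ls/3} - \left(\tfrac{1}{2} W + \tfrac{2}{3} P\right)\frac{1}{s^4}\right] + \frac{y_1(0)}{s^2} + \frac{y_3(0)}{s^4} \]

Taking inverse transforms, making use of the second shift theorem (Theorem 5.4), 
gives the deflection y(x) as 

\[ y(x) = -\frac{1}{EI}\left[\frac{W}{24 l} x^4 + \tfrac{1}{6} P \left(x - \tfrac{1}{3} l\right)^3 H\!\left(x - \tfrac{1}{3} l\right) - \tfrac{1}{6}\left(\tfrac{1}{2} W + \tfrac{2}{3} P\right) x^3\right] + y_1(0)\,x + \tfrac{1}{6} y_3(0)\,x^3 \]

To obtain the value of the undetermined constants y_1(0) and y_3(0), we employ the 
unused boundary conditions at x = l, namely y(l) = 0 and y_2(l) = 0. For x > l/3 

\[ y(x) = -\frac{1}{EI}\left[\frac{W}{24 l} x^4 + \tfrac{1}{6} P \left(x - \tfrac{1}{3} l\right)^3 - \tfrac{1}{6}\left(\tfrac{1}{2} W + \tfrac{2}{3} P\right) x^3\right] + y_1(0)\,x + \tfrac{1}{6} y_3(0)\,x^3 \]

\[ \frac{d^2 y}{dx^2} = y_2(x) = -\frac{1}{EI}\left[\frac{W x^2}{2l} + P\left(x - \tfrac{1}{3} l\right) - \left(\tfrac{1}{2} W + \tfrac{2}{3} P\right) x\right] + y_3(0)\,x \]

Thus taking y_2(l) = 0 gives y_3(0) = 0, and taking y(l) = 0 gives 

\[ -\frac{1}{EI}\left(\tfrac{1}{24} W l^3 + \tfrac{4}{81} P l^3 - \tfrac{1}{12} W l^3 - \tfrac{1}{9} P l^3\right) + y_1(0)\,l = 0 \]

so that 

\[ y_1(0) = -\frac{l^2}{EI}\left(\tfrac{1}{24} W + \tfrac{5}{81} P\right) \]

Substituting back, we find that the deflection y(x) is given by 

\[ y(x) = -\frac{W}{EI}\left(\frac{x^4}{24 l} - \tfrac{1}{12} x^3 + \tfrac{1}{24} l^2 x\right) - \frac{P}{EI}\left(\tfrac{5}{81} l^2 x - \tfrac{1}{9} x^3\right) - \frac{P}{6EI}\left(x - \tfrac{1}{3} l\right)^3 H\!\left(x - \tfrac{1}{3} l\right) \]

or, for the two sections of the beam, 

\[ y(x) = \begin{cases}
-\dfrac{W}{EI}\left(\dfrac{x^4}{24 l} - \tfrac{1}{12} x^3 + \tfrac{1}{24} l^2 x\right) - \dfrac{P}{EI}\left(\tfrac{5}{81} l^2 x - \tfrac{1}{9} x^3\right) & (0 < x < \tfrac{1}{3} l) \\[2ex]
-\dfrac{W}{EI}\left(\dfrac{x^4}{24 l} - \tfrac{1}{12} x^3 + \tfrac{1}{24} l^2 x\right) - \dfrac{P}{EI}\left(\tfrac{19}{162} l^2 x + \tfrac{1}{18} x^3 - \tfrac{1}{6} x^2 l - \tfrac{1}{162} l^3\right) & (\tfrac{1}{3} l < x < l)
\end{cases} \]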
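
The result of Example 5.49 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample data (l = 3, W = 12, P = 6, EI = 1) are assumptions chosen only for illustration. It evaluates the Heaviside form of y(x) with exact rational arithmetic, confirms that the deflection vanishes at both supports, and confirms that the separate closed form for the right-hand section agrees with it.

```python
from fractions import Fraction


def deflection(x, l, W, P, EI):
    """Deflection y(x) of the beam in Example 5.49, measured upwards."""
    y = -(W / EI) * (x**4 / (24 * l) - x**3 / 12 + l**2 * x / 24)
    y -= (P / EI) * (Fraction(5, 81) * l**2 * x - x**3 / 9)
    if x >= l / 3:  # Heaviside step H(x - l/3) switches the last term on
        y -= P / (6 * EI) * (x - l / 3) ** 3
    return y


def deflection_right_section(x, l, W, P, EI):
    """Closed form quoted for the section l/3 < x < l."""
    y = -(W / EI) * (x**4 / (24 * l) - x**3 / 12 + l**2 * x / 24)
    y -= (P / EI) * (Fraction(19, 162) * l**2 * x + x**3 / 18
                     - x**2 * l / 6 - l**3 / 162)
    return y


# Hypothetical data, held as exact rationals: l = 3, W = 12, P = 6, EI = 1.
l, W, P, EI = (Fraction(v) for v in (3, 12, 6, 1))
print(deflection(Fraction(0), l, W, P, EI))  # deflection at the left support
print(deflection(l, l, W, P, EI))            # deflection at the right support
print(deflection(l / 2, l, W, P, EI))        # midspan (negative means downwards)
```

Exact fractions are used so that the boundary conditions can be tested for equality rather than to within a tolerance.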
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5.5.14 Exercises 


31 Find the deflection of a beam simply supported at its ends x = 0 and x = l, bending 
under a uniformly distributed self-weight M and a concentrated load W at x = l/2. 

32 A cantilever beam of negligible weight and of length l is clamped at the end x = 0. 
Determine the deflection of the beam when it is subjected to a load per unit length, w, 
over the section x = x_1 to x = x_2. What is the maximum deflection if x_1 = 0 and x_2 = l? 

33 A uniform cantilever beam of length l is subjected to a concentrated load W at a point 
distance b from the fixed end. Determine the deflection of the beam, distinguishing 
between the sections 0 < x ≤ b and b < x ≤ l. 


5.6 Transfer functions 

5.6.1 Definitions 

[Figure 5.41: Transfer function block diagram: input U(s), system G(s), output X(s).] 


The transfer function of a linear time-invariant system is defined to be the ratio of 
the Laplace transform of the system output (or response function) to the Laplace 
transform of the system input (or forcing function), under the assumption that all the initial 
conditions are zero (that is, the system is initially in a quiescent state). 

Transfer functions are frequently used in engineering to characterize the input-output 
relationships of linear time-invariant systems, and play an important role in the 
analysis and design of such systems. 

Consider a linear time-invariant system characterized by the differential equation 

\[ a_n \frac{d^n x}{dt^n} + a_{n-1} \frac{d^{n-1} x}{dt^{n-1}} + \dots + a_0 x = b_m \frac{d^m u}{dt^m} + \dots + b_0 u \tag{5.65} \]

where n ≥ m, the a's and b's are constant coefficients, and x(t) is the system response or 
output to the input or forcing term u(t) applied at time t = 0. Taking Laplace transforms 
throughout in (5.65) will lead to the transformed equation. Since all the initial 
conditions are assumed to be zero, we see from (5.15) that, in order to obtain the transformed 
equation, we simply replace d/dt by s, giving 


\[ (a_n s^n + a_{n-1} s^{n-1} + \dots + a_0)\,X(s) = (b_m s^m + \dots + b_0)\,U(s) \]

where X(s) and U(s) denote the Laplace transforms of x(t) and u(t) respectively. 
The system transfer function G(s) is then defined to be 

\[ G(s) = \frac{X(s)}{U(s)} = \frac{b_m s^m + \dots + b_0}{a_n s^n + \dots + a_0} \tag{5.66} \]

with (5.66) being referred to as the transfer function model of the system characterized 
by the differential equation model (5.65). Diagrammatically this may be represented by the 
input-output block diagram of Figure 5.41. 

Writing 

\[ P(s) = b_m s^m + \dots + b_0 \]

\[ Q(s) = a_n s^n + \dots + a_0 \]

the transfer function may be expressed as 

\[ G(s) = \frac{P(s)}{Q(s)} \]

where, in order to make the system physically realizable, the degrees m and n of the 
polynomials P(s) and Q(s) must be such that n ≥ m. This is because it follows from 
(5.61) that if m > n then the system response x(t) to a realistic input u(t) will involve 
impulses. 

The equation Q(s) = 0 is called the characteristic equation of the system; its order 
determines the order of the system, and its roots are referred to as the poles of the 
transfer function. Likewise, the roots of P(s) = 0 are referred to as the zeros of the 
transfer function. 

It is important to realize that, in general, a transfer function is only used to character- 
ize a linear time-invariant system. It is a property of the system itself, and is independent 
of both system input and output. 

Although the transfer function characterizes the dynamics of the system, it provides 
no information concerning the actual physical structure of the system, and in fact sys- 
tems that are physically different may have identical transfer functions; for example, 
the mass-spring-damper system of Figure 5.12 and the LCR circuit of Figure 5.8 both 
have the transfer function 


\[ G(s) = \frac{X(s)}{U(s)} = \frac{1}{\alpha s^2 + \beta s + \gamma} \]

In the mass-spring-damper system X(s) determines the displacement x(t) of the mass 
and U(s) represents the applied force F(t), while α denotes the mass, β the damping 
coefficient and γ the spring constant. On the other hand, in the LCR circuit X(s) 
determines the charge q(t) on the condenser and U(s) represents the applied emf e(t), while 
α denotes the inductance, β the resistance and γ the reciprocal of the capacitance. 

In practice, an overall system may be made up of a number of components each 
characterized by its own transfer function and related operation box. The overall system 
input-output transfer function is then obtained by the rules of block diagram algebra. 

Since G(s) may be written as 


\[ G(s) = \frac{b_m (s - z_1)(s - z_2) \cdots (s - z_m)}{a_n (s - p_1)(s - p_2) \cdots (s - p_n)} \]

where the z_i's and p_i's are the transfer function zeros and poles respectively, we observe 
that G(s) is known, apart from a constant factor, if the positions of all the poles and 
zeros are known. Consequently, a plot of the poles and zeros of G(s) is often used as 
an aid in the graphical analysis of the transfer function (a common convention is to 
mark the position of a zero by a circle ○ and that of a pole by a cross ×). Since the 
coefficients of the polynomials P(s) and Q(s) are real, all complex roots always occur in 
complex conjugate pairs, so that the pole-zero plot is symmetrical about the real axis. 


Example 5.50 

The response x(t) of a system to a forcing function u(t) is determined by the differential 
equation 

\[ 9\frac{d^2 x}{dt^2} + 12\frac{dx}{dt} + 13x = 2\frac{du}{dt} + 3u \]

(a) Determine the transfer function characterizing the system. 
(b) Write down the characteristic equation of the system. What is the order of the 
system? 
(c) Determine the transfer function poles and zeros, and illustrate them 
diagrammatically in the s plane. 

Solution 

(a) Assuming all the initial conditions to be zero, taking Laplace transforms throughout 
in the differential equation 

\[ 9\frac{d^2 x}{dt^2} + 12\frac{dx}{dt} + 13x = 2\frac{du}{dt} + 3u \]

leads to 

\[ (9s^2 + 12s + 13)\,X(s) = (2s + 3)\,U(s) \]

so that the system transfer function is given by 

\[ G(s) = \frac{X(s)}{U(s)} = \frac{2s + 3}{9s^2 + 12s + 13} \]

(b) The characteristic equation of the system is 

\[ 9s^2 + 12s + 13 = 0 \]

and the system is of order 2. 

(c) The transfer function poles are the roots of the characteristic equation 

\[ 9s^2 + 12s + 13 = 0 \]

which are 

\[ s = \frac{-12 \pm \sqrt{144 - 468}}{18} = \frac{-2 \pm j3}{3} \]

That is, the transfer function has simple poles at 

\[ s = -\tfrac{2}{3} + j \quad \text{and} \quad s = -\tfrac{2}{3} - j \]

The transfer function zeros are determined by equating the numerator polynomial 
2s + 3 to zero, giving a single zero at 

\[ s = -\tfrac{3}{2} \]

The corresponding pole-zero plot in the s plane is shown in Figure 5.42. 

[Figure 5.42: Pole (×)-zero (○) plot for Example 5.50.] 
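
The poles and zero found in Example 5.50 can be reproduced with a few lines of code. This is a small sketch using only the Python standard library; the helper name quadratic_roots is an assumption made for illustration. It applies the quadratic formula with complex arithmetic to the characteristic equation.

```python
import cmath


def quadratic_roots(a, b, c):
    """Roots of a*s**2 + b*s + c = 0, complex when the discriminant is negative."""
    root = cmath.sqrt(b * b - 4 * a * c)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


# G(s) = (2s + 3) / (9s^2 + 12s + 13), as in Example 5.50.
poles = quadratic_roots(9, 12, 13)  # roots of the characteristic equation
zero = -3 / 2                       # root of the numerator 2s + 3

print("poles:", poles)
print("zero:", zero)
```

Because the coefficients are real, the two poles come out as a complex conjugate pair, which is the symmetry about the real axis noted in the text.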


A transfer function (tf) is implemented within MATLAB using the commands 

    s = tf('s') 
    G = G(s) 

Thus, entering G=(2*s+3)/(9*s^2+12*s+13) generates 

    transfer function = (2s + 3)/(9s^2 + 12s + 13) 

The command poly(G) generates the characteristic polynomial, whilst the commands 
pole(G) and zero(G) generate the poles and zeros respectively. The command 
pzmap(G) draws the pole-zero map. 

5.6.2 Stability 


The stability of a system is a property of vital importance to engineers. Intuitively, we 
may regard a stable system as one that will remain at rest unless it is excited by an 
external source, and will return to rest if all such external influences are removed. Thus 
a stable system is one whose response, in the absence of an input, will approach zero 
as time approaches infinity. This then ensures that any bounded input produces a 
bounded output; this property is frequently taken to be the definition of a stable linear 
system. 

Clearly, stability is a property of the system itself, and does not depend on the 
system input or forcing function. Since a system may be characterized in the s domain 
by its transfer function G(s), it should be possible to use the transfer function to specify 
conditions for the system to be stable. 

In considering the time response of 


\[ X(s) = G(s)\,U(s), \qquad G(s) = \frac{P(s)}{Q(s)} \]

to any given input u(t), it is necessary to factorize the denominator polynomial 

\[ Q(s) = a_n s^n + a_{n-1} s^{n-1} + \dots + a_0 \]

and various forms of factors can be involved. 

Simple factor of the form s + α, with α real 

This corresponds to a simple pole at s = -α, and will in the partial-fractions expansion 
of G(s) lead to a term of the form c/(s + α) having corresponding time response 
c e^{-αt} H(t), using the strict form of the inverse given in (5.12). If α > 0, so that the pole 
is in the left half of the s plane, the time response will tend to zero as t → ∞. If α < 0, 
so that the pole is in the right half of the s plane, the time response will increase without 
bound as t → ∞. It follows that a stable system must have real-valued simple poles of 
G(s) in the left half of the s plane. 

α = 0 corresponds to a simple pole at the origin, having a corresponding time 
response that is a step cH(t). A system having such a pole is said to be marginally 
stable; this does not ensure that a bounded input will lead to a bounded output, since, 
for example, if such a system has an input that is a step d applied at time t = 0 then the 
response will be a ramp cdtH(t), which is unbounded as t → ∞. 


Repeated simple factors of the form (s + α)^n, with α real 

This corresponds to a multiple pole at s = -α, and will lead in the partial-fractions 
expansion of G(s) to a term of the form c/(s + α)^n having corresponding time response 
[c/(n - 1)!] t^{n-1} e^{-αt} H(t). Again the response will decay to zero as t → ∞ only if α > 0, 
indicating that a stable system must have all real-valued repeated poles of G(s) in the 
left half of the s plane. 


Quadratic factors of the form (s + α)^2 + β^2, with α and β real 

This corresponds to a pair of complex conjugate poles at s = -α + jβ, s = -α - jβ, and 
will lead in the partial-fractions expansion of G(s) to a term of the form 

\[ \frac{c(s + \alpha) + d\beta}{(s + \alpha)^2 + \beta^2} \]

having corresponding time response 

\[ e^{-\alpha t}(c\cos\beta t + d\sin\beta t) \equiv A e^{-\alpha t}\sin(\beta t + \gamma) \]

where A = √(c^2 + d^2) and γ = tan^{-1}(c/d). 

Again we see that poles in the left half of the s plane (corresponding to α > 0) have 
corresponding time responses that die away, in the form of an exponentially damped 
sinusoid, as t → ∞. A stable system must therefore have complex conjugate poles 
located in the left half of the s plane; that is, all complex poles must have a negative 
real part. 

If α = 0, the corresponding time response will be a periodic sinusoid, which will not 
die away as t → ∞. Again this corresponds to a marginally stable system, and will, for 
example, give rise to a response that increases without bound as t → ∞ when the input 
is a sinusoid at the same frequency β. 
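
The amplitude-phase form of the damped sinusoid can be checked numerically. The sketch below is illustrative only: the values of α, β, c and d are hypothetical, and atan2(c, d) is used in place of tan^{-1}(c/d) so that the phase lands in the correct quadrant when d is negative.

```python
import math


def amplitude_phase(c, d):
    """Return (A, gamma) with c*cos(x) + d*sin(x) == A*sin(x + gamma)."""
    return math.hypot(c, d), math.atan2(c, d)


alpha, beta, c, d = 0.5, 3.0, 2.0, 1.0  # hypothetical values
A, gamma = amplitude_phase(c, d)
for t in (0.0, 0.4, 1.7):
    lhs = math.exp(-alpha * t) * (c * math.cos(beta * t) + d * math.sin(beta * t))
    rhs = A * math.exp(-alpha * t) * math.sin(beta * t + gamma)
    print(t, lhs, rhs)  # the two columns should agree
```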

A summary of the responses corresponding to the various types of poles is given in 
Figure 5.43. 

The concept of stability may be expressed in the form of Definition 5.3. 


Definition 5.3 


A physically realizable causal time-invariant linear system with transfer function 
G(s) is stable provided that all the poles of G(s) are in the left half of the s plane. 

The requirement in the definition that the system be physically realizable, that is n ≥ m 
in the transfer function G(s) of (5.66), avoids terms of the form s^{m-n} in the 
partial-fractions expansion of G(s). Such a term would correspond to differentiation of degree 
m - n, and were an input such as sin ωt used to excite the system then the response 
would include a term such as ω^{m-n} sin ωt or ω^{m-n} cos ωt, which could be made as large 
as desired by increasing the input frequency ω. 
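
The pole-location rules discussed above can be collected into a short classifier. This is a sketch under stated assumptions rather than a definitive test: the function name classify and the tolerance tol are hypothetical, repeated poles must be listed once per multiplicity, and repeated poles on the imaginary axis are labelled "unstable" because their responses (a ramp or a growing sinusoid) are unbounded.

```python
from collections import Counter


def classify(poles, tol=1e-9):
    """Classify a system from the poles of G(s); list repeated poles repeatedly."""
    poles = [complex(p) for p in poles]
    if any(p.real > tol for p in poles):
        return "unstable"
    on_axis = [p for p in poles if abs(p.real) <= tol]
    if not on_axis:
        return "stable"
    # Count how often each imaginary-axis pole occurs (rounded to group equal values).
    multiplicity = Counter(round(p.imag, 6) for p in on_axis)
    if max(multiplicity.values()) > 1:
        return "unstable"  # assumption: repeated axis poles give growing responses
    return "marginally stable"


print(classify([complex(-2 / 3, 1), complex(-2 / 3, -1)]))  # poles of Example 5.50
print(classify([0, -1]))                                    # simple pole at the origin
print(classify([3, -2]))                                    # pole in the right half-plane
```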


Figure 5.43 Relationship between transfer function poles and time response. 
[Pole-position plots and time-response sketches not reproduced.] 

Poles of G(s) in form σ ± jω        Nature of response 
σ = ω = 0                           Constant 
σ = ω = 0 (multiplicity 2)          Ramp 
σ < 0, ω = 0                        Exponential decay 
σ > 0, ω = 0                        Exponential growth 
σ = 0, ω > 0                        Sinusoidal 
σ = 0, ω > 0 (multiplicity 2)       Growing sinusoidal 
σ < 0, ω > 0                        Exponentially decaying sinusoidal 
σ > 0, ω > 0                        Exponentially growing sinusoidal 

In terms of the poles of the transfer function G(s), its abscissa of convergence σ_c 
corresponds to the real part of the pole located furthest to the right in the s plane. For 
example, if 

\[ G(s) = \frac{s + 1}{(s + 3)(s + 2)} \]

then the abscissa of convergence σ_c = -2. 

It follows from Definition 5.3 that the transfer function G(s) of a stable system has an 
abscissa of convergence σ_c = -α, with α > 0. Thus its region of convergence includes 
the imaginary axis, so that G(s) exists when s = jω. We shall return to this result when 
considering the relationship between Laplace and Fourier transforms in Section 8.4.1. 

According to Definition 5.3, in order to prove stability, we need to show that all the 
roots of the characteristic equation 

\[ Q(s) = a_n s^n + a_{n-1} s^{n-1} + \dots + a_1 s + a_0 = 0 \tag{5.67} \]

have negative real parts (that is, they lie in the left half of the s plane). Various criteria 
exist to show that all the roots satisfy this requirement, and it is not necessary to solve 
the equation to prove stability. One widely used criterion is the Routh-Hurwitz criterion, 
which can be stated as follows: 

A necessary and sufficient condition for all the roots of equation (5.67) 
to have negative real parts is that the determinants Δ_1, Δ_2, ..., Δ_n are 
all positive, where 

\[ \Delta_r = \begin{vmatrix}
a_{n-1} & a_n & 0 & 0 & \cdots & 0 \\
a_{n-3} & a_{n-2} & a_{n-1} & a_n & \cdots & 0 \\
a_{n-5} & a_{n-4} & a_{n-3} & a_{n-2} & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
a_{n-(2r-1)} & a_{n-(2r-2)} & a_{n-(2r-3)} & a_{n-(2r-4)} & \cdots & a_{n-r}
\end{vmatrix} \tag{5.68} \]

it being understood that in each determinant all the a's with subscripts 
that are either negative or greater than n are to be replaced by zero. 

Example 5.51 



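Show that the roots of the characteristic equation 

\[ s^4 + 9s^3 + 33s^2 + 51s + 26 = 0 \]

all have negative real parts. 

Solution 

In this case n = 4, a_0 = 26, a_1 = 51, a_2 = 33, a_3 = 9, a_4 = 1 and a_r = 0 (r > 4). The 
determinants of the Routh-Hurwitz criterion are 

\[ \Delta_1 = |a_{n-1}| = |a_3| = |9| = 9 > 0 \]

\[ \Delta_2 = \begin{vmatrix} a_{n-1} & a_n \\ a_{n-3} & a_{n-2} \end{vmatrix}
= \begin{vmatrix} a_3 & a_4 \\ a_1 & a_2 \end{vmatrix}
= \begin{vmatrix} 9 & 1 \\ 51 & 33 \end{vmatrix} = 246 > 0 \]

\[ \Delta_3 = \begin{vmatrix} a_{n-1} & a_n & 0 \\ a_{n-3} & a_{n-2} & a_{n-1} \\ a_{n-5} & a_{n-4} & a_{n-3} \end{vmatrix}
= \begin{vmatrix} a_3 & a_4 & 0 \\ a_1 & a_2 & a_3 \\ a_{-1} & a_0 & a_1 \end{vmatrix}
= \begin{vmatrix} 9 & 1 & 0 \\ 51 & 33 & 9 \\ 0 & 26 & 51 \end{vmatrix} = 10440 > 0 \]

\[ \Delta_4 = \begin{vmatrix} a_{n-1} & a_n & 0 & 0 \\ a_{n-3} & a_{n-2} & a_{n-1} & a_n \\ a_{n-5} & a_{n-4} & a_{n-3} & a_{n-2} \\ a_{n-7} & a_{n-6} & a_{n-5} & a_{n-4} \end{vmatrix}
= \begin{vmatrix} a_3 & a_4 & 0 & 0 \\ a_1 & a_2 & a_3 & a_4 \\ a_{-1} & a_0 & a_1 & a_2 \\ a_{-3} & a_{-2} & a_{-1} & a_0 \end{vmatrix}
= \begin{vmatrix} 9 & 1 & 0 & 0 \\ 51 & 33 & 9 & 1 \\ 0 & 26 & 51 & 33 \\ 0 & 0 & 0 & 26 \end{vmatrix} = 26\Delta_3 > 0 \]

Thus Δ_1 > 0, Δ_2 > 0, Δ_3 > 0 and Δ_4 > 0, so that all the roots of the given characteristic 
equation have negative real parts. This is readily checked, since the roots are -2, -1, 
-3 + j2 and -3 - j2. 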
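
The determinants of the Routh-Hurwitz criterion are tedious to expand by hand for larger n, so a short program is a natural check. The following sketch is illustrative: the function names are assumptions, and the determinant is computed by cofactor expansion, which is adequate only for the small orders met here. The entry in row i and column j of Δ_r is a_{n-2i+j}, with out-of-range subscripts replaced by zero as in (5.68).

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def hurwitz_determinants(coeffs):
    """coeffs = [a_n, ..., a_1, a_0]; returns [Delta_1, ..., Delta_n]."""
    n = len(coeffs) - 1

    def a(k):
        # a_k, taken as zero when the subscript is negative or greater than n
        return coeffs[n - k] if 0 <= k <= n else 0

    dets = []
    for r in range(1, n + 1):
        matrix = [[a(n - 2 * i + j) for j in range(1, r + 1)]
                  for i in range(1, r + 1)]
        dets.append(det(matrix))
    return dets


def all_roots_in_left_half_plane(coeffs):
    """Routh-Hurwitz test: every determinant must be positive."""
    return all(d > 0 for d in hurwitz_determinants(coeffs))


# Example 5.51: s^4 + 9s^3 + 33s^2 + 51s + 26 = 0
print(hurwitz_determinants([1, 9, 33, 51, 26]))
print(all_roots_in_left_half_plane([1, 9, 33, 51, 26]))
```

Integer coefficients keep the arithmetic exact; symbolic parameters such as a gain K would need a computer algebra system instead.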


Example 5.52 

The steady motion of a steam-engine governor is modelled by the differential equations 

\[ m\ddot{\eta} + b\dot{\eta} + d\eta - e\omega = 0 \tag{5.69} \]

\[ I_0 \dot{\omega} = -f\eta \tag{5.70} \]

where η is a small fluctuation in the angle of inclination, ω a small fluctuation in the 
angular velocity of rotation, and m, b, d, e, f and I_0 are all positive constants. Show that 
the motion of the governor is stable provided that 

\[ \frac{bd}{m} > \frac{ef}{I_0} \]

Solution 

Differentiating (5.69) gives 

\[ m\dddot{\eta} + b\ddot{\eta} + d\dot{\eta} - e\dot{\omega} = 0 \]

which, on using (5.70), leads to 

\[ m\dddot{\eta} + b\ddot{\eta} + d\dot{\eta} + \frac{ef}{I_0}\eta = 0 \]

for which the corresponding characteristic equation is 

\[ m s^3 + b s^2 + d s + \frac{ef}{I_0} = 0 \]

This is a cubic polynomial, so the parameters of (5.67) are 

\[ n = 3, \quad a_0 = \frac{ef}{I_0}, \quad a_1 = d, \quad a_2 = b, \quad a_3 = m \quad (a_r = 0,\ r > 3) \]

The determinants (5.68) of the Routh-Hurwitz criterion are 

\[ \Delta_1 = |a_2| = b > 0 \]

\[ \Delta_2 = \begin{vmatrix} a_2 & a_3 \\ a_0 & a_1 \end{vmatrix} = \begin{vmatrix} b & m \\ ef/I_0 & d \end{vmatrix} = bd - \frac{mef}{I_0} \]

(and so Δ_2 > 0 provided that bd - mef/I_0 > 0 or bd/m > ef/I_0), and 

\[ \Delta_3 = \begin{vmatrix} a_2 & a_3 & 0 \\ a_0 & a_1 & a_2 \\ 0 & 0 & a_0 \end{vmatrix} = a_0 \Delta_2 > 0 \quad \text{if } \Delta_2 > 0 \]

Thus the action of the governor is stable provided that Δ_2 > 0; that is, 

\[ \frac{bd}{m} > \frac{ef}{I_0} \]

5.6.3 Impulse response 


From (5.66), we find that for a system having transfer function G(s) the response x(t) 
of the system, initially in a quiescent state, to an input u(t) is determined by the 
transformed relationship 


X(s) = G(s) U(s) 


If the input u(t) is taken to be the unit impulse function δ(t) then the system response 
will be determined by 

\[ X(s) = G(s)\,\mathcal{L}\{\delta(t)\} = G(s) \]

Taking inverse Laplace transforms leads to the corresponding time response h(t), which 
is called the impulse response of the system (it is also sometimes referred to as the 
weighting function of the system); that is, the impulse response is given by 

\[ h(t) = \mathcal{L}^{-1}\{X(s)\} = \mathcal{L}^{-1}\{G(s)\} \tag{5.71} \]

We therefore have the following definition. 

Definition 5.4: Impulse response 

The impulse response h(t) of a linear time-invariant system is the response of the 
system to a unit impulse applied at time t = 0 when all the initial conditions are zero. 
It is such that L{h(t)} = G(s), where G(s) is the system transfer function. 


Since the impulse response is the inverse Laplace transform of the transfer function, 
it follows that both the impulse response and the transfer function carry the same informa- 
tion about the dynamics of a linear time-invariant system. Theoretically, therefore, it is 
possible to determine the complete information about the system by exciting it with an 
impulse and measuring the response. For this reason, it is common practice in engineering 
to regard the transfer function as being the Laplace transform of the impulse response, 
since this places greater emphasis on the parameters of the system when considering 
system design. 

We saw in Section 5.6.2 that, since the transfer function G(s) completely characterizes 
a linear time-invariant system, it can be used to specify conditions for system stability, 
which are that all the poles of G(s) lie in the left half of the s plane. Alternatively, 
characterizing the system by its impulse response, we can say that the system is stable 
provided that its impulse response decays to zero as t → ∞. 
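
The link between the poles of G(s) and the decay of the impulse response can be made concrete for a simple case. The sketch below is an illustration under restrictive assumptions: G(s) has a constant numerator K and distinct real poles, so that partial fractions give h(t) as a sum of terms c_i e^{p_i t} with c_i = K / Π_{j≠i}(p_i - p_j). The function name and the sample transfer function G(s) = 5/((s + 2)(s + 3)) are chosen for illustration.

```python
import math


def impulse_response(K, poles):
    """h(t) for G(s) = K / prod(s - p_i), assuming distinct real poles p_i."""
    residues = []
    for i, p in enumerate(poles):
        denominator = 1.0
        for j, q in enumerate(poles):
            if j != i:
                denominator *= (p - q)
        residues.append(K / denominator)

    def h(t):
        return sum(c * math.exp(p * t) for c, p in zip(residues, poles))

    return h


h = impulse_response(5.0, [-2.0, -3.0])  # G(s) = 5 / ((s + 2)(s + 3))
for t in (0.0, 0.5, 1.0, 5.0, 20.0):
    print(t, h(t))  # decays towards zero because both poles are negative
```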


Example 5.53 

Determine the impulse response of the linear system whose response x(t) to an input 
u(t) is determined by the differential equation 

\[ \frac{d^2 x}{dt^2} + 5\frac{dx}{dt} + 6x = 5u(t) \tag{5.72} \]

Solution 

The impulse response h(t) is the system response to u(t) = δ(t) when all the initial 
conditions are zero. It is therefore determined as the solution of the differential equation 

\[ \frac{d^2 h}{dt^2} + 5\frac{dh}{dt} + 6h = 5\delta(t) \tag{5.73} \]

subject to the initial conditions h(0) = dh/dt(0) = 0. Taking Laplace transforms in (5.73) gives 

\[ (s^2 + 5s + 6)\,H(s) = 5\,\mathcal{L}\{\delta(t)\} = 5 \]

so that 

\[ H(s) = \frac{5}{(s + 3)(s + 2)} = \frac{5}{s + 2} - \frac{5}{s + 3} \]

which, on inversion, gives the desired impulse response 

\[ h(t) = 5(e^{-2t} - e^{-3t}) \]

Alternatively, the transfer function G(s) of the system determined by (5.72) is 

\[ G(s) = \frac{5}{s^2 + 5s + 6} \]

so that h(t) = L^{-1}{G(s)} = 5(e^{-2t} - e^{-3t}) as before. 

Note: This example serves to illustrate the necessity for incorporating 0⁻ as the lower 
limit in the Laplace transform integral, in order to accommodate an impulse applied 
at t = 0. The effect of the impulse is to cause a step change in dx/dt at t = 0, with the initial 
condition accounting for what happens up to 0⁻. 

In MATLAB a plot of the impulse response is obtained using the commands 

    s = tf('s') 
    G = G(s) 
    impulse(G) 

5.6.4 Initial- and final-value theorems 


The initial- and final-value theorems are two useful theorems that enable us to predict 
system behaviour as t → 0 and t → ∞ without actually inverting Laplace transforms. 


Theorem 5.6 The initial-value theorem 

If f(t) and f'(t) are both Laplace-transformable and if the limit of sF(s) as s → ∞ exists then 

\[ \lim_{t \to 0^+} f(t) = f(0^+) = \lim_{s \to \infty} sF(s) \]

Proof 

From (5.13), 

\[ \mathcal{L}\{f'(t)\} = \int_{0^-}^{\infty} f'(t)\,e^{-st}\,dt = sF(s) - f(0^-) \]

where we have highlighted the fact that the lower limit is 0⁻. Hence 

\[ \lim_{s \to \infty} [sF(s) - f(0^-)] = \lim_{s \to \infty} \int_{0^-}^{\infty} f'(t)\,e^{-st}\,dt
= \lim_{s \to \infty} \int_{0^-}^{0^+} f'(t)\,e^{-st}\,dt + \lim_{s \to \infty} \int_{0^+}^{\infty} f'(t)\,e^{-st}\,dt \tag{5.74} \]

If f(t) is discontinuous at the origin, so that f(0⁺) ≠ f(0⁻), then, from (5.59), f'(t) contains 
an impulse term [f(0⁺) - f(0⁻)]δ(t), so that 

\[ \lim_{s \to \infty} \int_{0^-}^{0^+} f'(t)\,e^{-st}\,dt = f(0^+) - f(0^-) \]

Also, since the Laplace transform of f'(t) exists, it is of exponential order and we have 

\[ \lim_{s \to \infty} \int_{0^+}^{\infty} f'(t)\,e^{-st}\,dt = 0 \]

so that (5.74) becomes 

\[ \lim_{s \to \infty} sF(s) - f(0^-) = f(0^+) - f(0^-) \]

giving the required result: 

\[ \lim_{s \to \infty} sF(s) = f(0^+) \]

If f(t) is continuous at the origin then f'(t) does not contain an impulse term, and the 
right-hand side of (5.74) is zero, giving 

\[ \lim_{s \to \infty} sF(s) = f(0^-) = f(0^+) \]

end of theorem 

It is important to recognize that the initial-value theorem does not give the initial 
value f(0⁻) used when determining the Laplace transform, but rather gives the value of 
f(t) as t → 0⁺. This distinction is highlighted in the following example. 

Example 5.54 



The circuit of Figure 5.44 consists of a resistance R and a capacitance C connected in 
series together with constant voltage source E_0. Prior to closing the switch at time t = 0, 
both the charge on the capacitor and the resulting current in the circuit are zero. 
Determine the current i(t) in the circuit at time t after the switch is closed, and investigate the 
use of the initial-value theorem. 

[Figure 5.44: RC circuit of Example 5.54.] 

Solution 

Applying Kirchhoff's law to the circuit of Figure 5.44, we have 

\[ Ri + \frac{1}{C}\int i\,dt = E_0 \]

which, on taking Laplace transforms, gives the transformed equation 

\[ R\,I(s) + \frac{1}{C}\frac{I(s)}{s} = \frac{E_0}{s} \]

Therefore 

\[ I(s) = \frac{E_0/R}{s + 1/RC} \]

Taking inverse transforms gives the current i(t) at t ≥ 0 as 

\[ i(t) = \frac{E_0}{R} e^{-t/RC} \tag{5.75} \]

Applying the initial-value theorem, 

\[ \lim_{t \to 0^+} i(t) = \lim_{s \to \infty} sI(s) = \lim_{s \to \infty} \frac{sE_0/R}{s + 1/RC} = \lim_{s \to \infty} \frac{E_0/R}{1 + 1/RCs} = \frac{E_0}{R} \]

That is, 

\[ i(0^+) = \frac{E_0}{R} \]

a result that is readily confirmed by allowing t → 0⁺ in (5.75). We note that this is 
not the same as the initial state i(0⁻) = 0 owing to the fact that there is a step change in 
i(t) at t = 0. 
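
The limit used in Example 5.54 can be observed numerically by evaluating sI(s) for increasingly large s. This is a minimal sketch; the component values E0 = 10, R = 5 and C = 0.2 are hypothetical and chosen so that E0/R = 2 and RC = 1.

```python
import math

E0, R, C = 10.0, 5.0, 0.2  # hypothetical component values (RC = 1 s)


def current_transform(s):
    """I(s) = (E0/R) / (s + 1/(RC))."""
    return (E0 / R) / (s + 1 / (R * C))


def current(t):
    """i(t) = (E0/R) exp(-t/(RC)) for t >= 0, equation (5.75)."""
    return (E0 / R) * math.exp(-t / (R * C))


for s in (1e2, 1e4, 1e6):
    print(s, s * current_transform(s))  # approaches E0/R as s grows
print(current(1e-9))                    # value just after the switch closes
```

Both sequences approach E0/R, the value at t = 0⁺, not the quiescent value zero that holds before the switch closes.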


Theorem 5.7 The final-value theorem 

If f(t) and f'(t) are both Laplace-transformable and the limit of f(t) as t → ∞ exists then 

\[ \lim_{t \to \infty} f(t) = \lim_{s \to 0} sF(s) \]

Proof 

From (5.13), 

\[ \mathcal{L}\{f'(t)\} = \int_{0^-}^{\infty} f'(t)\,e^{-st}\,dt = sF(s) - f(0^-) \]

Taking limits, we have 

\[ \lim_{s \to 0} [sF(s) - f(0^-)] = \lim_{s \to 0} \int_{0^-}^{\infty} f'(t)\,e^{-st}\,dt = \int_{0^-}^{\infty} f'(t)\,dt = [f(t)]_{0^-}^{\infty} = \lim_{t \to \infty} f(t) - f(0^-) \]

giving the required result: 

\[ \lim_{t \to \infty} f(t) = \lim_{s \to 0} sF(s) \]

end of theorem 

The restriction that the limit of f(t) as t → ∞ must exist means that the theorem does not hold for 
functions such as e^t, which tends to infinity as t → ∞, or sin ωt, whose limit is undefined. 
Since in practice the final-value theorem is used to obtain the behaviour of f(t) as t → ∞ 
from knowledge of the transform F(s), it is more common to express the restriction in 
terms of restrictions on F(s), which are that sF(s) must have all its poles in the left half 
of the s plane; that is, sF(s) must represent a stable transfer function. It is important that 
the theorem be used with caution and that this restriction be fully recognized, since the 
existence of the limit of sF(s) as s → 0 does not imply that f(t) has a limiting value as t → ∞. 

Example 5.55 



Investigate the application of the final-value theorem to the transfer function 

\[ F(s) = \frac{1}{(s + 2)(s - 3)} \tag{5.76} \]

Solution 

\[ \lim_{s \to 0} sF(s) = \lim_{s \to 0} \frac{s}{(s + 2)(s - 3)} = 0 \]

so the use of the final-value theorem implies that for the time function f(t) corresponding 
to F(s) we have 

\[ \lim_{t \to \infty} f(t) = 0 \]

However, taking inverse transforms in (5.76) gives 

\[ f(t) = \tfrac{1}{5}(e^{3t} - e^{-2t}) \]

implying that f(t) tends to infinity as t → ∞. This implied contradiction arises since the 
theorem is not valid in this case. Although the limit of sF(s) as s → 0 exists, sF(s) has a pole at s = 3, 
which is not in the left half of the s plane. 
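
The failure of the final-value theorem in Example 5.55 is easy to see numerically. The sketch below is illustrative; the helper final_value_theorem_applies is a hypothetical name for the pole test stated in the text (all poles of sF(s) in the left half of the s plane).

```python
import math


def F(s):
    """F(s) = 1 / ((s + 2)(s - 3)), equation (5.76)."""
    return 1 / ((s + 2) * (s - 3))


def f(t):
    """Inverse transform f(t) = (exp(3t) - exp(-2t)) / 5."""
    return (math.exp(3 * t) - math.exp(-2 * t)) / 5


def final_value_theorem_applies(poles_of_sF):
    """True when every pole of sF(s) has a negative real part."""
    return all(complex(p).real < 0 for p in poles_of_sF)


for s in (1e-2, 1e-4, 1e-6):
    print(s, s * F(s))   # tends to 0 as s -> 0
for t in (1.0, 5.0, 10.0):
    print(t, f(t))       # grows without bound
print(final_value_theorem_applies([-2, 3]))  # poles of sF(s) for this example
```

The limit of sF(s) exists, yet f(t) diverges, so the pole test should be applied before the theorem is trusted.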


The final-value theorem provides a useful vehicle for determining a system's 
steady-state gain (SSG) and the steady-state errors, or offsets, in feedback control systems, 
both of which are important features in control system design. 

The SSG of a stable system is the system's steady-state response, that is the response 
as t → ∞, to a unit step input. For a system with transfer function G(s) we have, from 
(5.66), that its response x(t) is related to the input u(t) by the transformed equation 

\[ X(s) = G(s)\,U(s) \]

For a unit step input 

\[ u(t) = 1H(t) \quad \text{giving} \quad U(s) = \frac{1}{s} \]

so that 

\[ X(s) = \frac{G(s)}{s} \]

From the final-value theorem, the steady-state gain is 

\[ \text{SSG} = \lim_{t \to \infty} x(t) = \lim_{s \to 0} sX(s) = \lim_{s \to 0} G(s) \]

Example 5.56 

Determine the steady-state gain of a system having transfer function 

\[ G(s) = \frac{20(1 + 3s)}{s^2 + 7s + 10} \]

Solution 

The response x(t) to a unit step input u(t) = 1H(t) is given by the transformed equation 

\[ X(s) = G(s)\,U(s) = \frac{20(1 + 3s)}{s^2 + 7s + 10} \cdot \frac{1}{s} \]

Then, by the final-value theorem, the steady-state gain is given by 

\[ \text{SSG} = \lim_{t \to \infty} x(t) = \lim_{s \to 0} sX(s) = \lim_{s \to 0} \frac{20(1 + 3s)}{s^2 + 7s + 10} = 2 \]

Note that for a step input of magnitude K, that is u(t) = KH(t), the steady-state response 
will be the limit of KG(s) as s → 0, which is 2K; that is, 

steady-state response to step input = SSG × magnitude of step input 

A unity feedback control system having forward-path transfer function G(s), reference 
input or desired output r(t) and actual output x(t) is illustrated by the block diagram 
of Figure 5.45. Defining the error to be e(t) = r(t) - x(t), it follows that 

[Figure 5.45: Unity feedback control system, with signals R(s), E(s) and X(s).] 

\[ G(s)\,E(s) = X(s) = R(s) - E(s) \]

giving 

\[ E(s) = \frac{R(s)}{1 + G(s)} \]

Thus, from the final-value theorem, the steady-state error (SSE) is 

\[ \text{SSE} = \lim_{t \to \infty} e(t) = \lim_{s \to 0} sE(s) = \lim_{s \to 0} \frac{sR(s)}{1 + G(s)} \tag{5.77} \]

Example 5.57 



Determine the SSE for the system of Figure 5.45 when G(s) is the same as in 
Example 5.56 and r(t) is a step of magnitude K. 

Solution 

Since r(t) = KH(t), we have R(s) = K/s, so, using (5.77), 

\[ \text{SSE} = \lim_{s \to 0} \frac{sK/s}{1 + G(s)} = \frac{K}{1 + \text{SSG}} \]

where SSG = 2 as determined in Example 5.56. Thus 

\[ \text{SSE} = \tfrac{1}{3} K \]

It is clear from Example 5.57 that if we are to reduce the SSE, which is clearly 
desirable in practice, then the SSG needs to be increased. However, such an increase 
could lead to an undesirable transient response, and in system design a balance must be 
achieved. Detailed design techniques for alleviating such problems are not considered 
here; for such a discussion the reader is referred to specialist texts (see for example 
J. Schwarzenbach and K. F. Gill, System Modelling and Control, Edward Arnold, 
London, 1984). 
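
The steady-state gain and steady-state error of Examples 5.56 and 5.57 reduce to evaluating G(s) at s = 0. The following sketch is illustrative only; the function names are assumptions, and evaluating at s = 0 in place of taking the limit is valid here because G(s) has no pole at the origin.

```python
def G(s):
    """Transfer function of Example 5.56."""
    return 20 * (1 + 3 * s) / (s**2 + 7 * s + 10)


def steady_state_gain(transfer_function):
    """SSG = lim_{s -> 0} G(s); G is continuous at s = 0 here, so evaluate there."""
    return transfer_function(0.0)


def steady_state_error(K, ssg):
    """SSE for a unity feedback loop driven by a step of magnitude K."""
    return K / (1 + ssg)


ssg = steady_state_gain(G)
print(ssg)                           # steady-state gain
print(steady_state_error(3.0, ssg))  # SSE for a step of magnitude 3
```

Raising the gain in G(s) lowers the SSE through the factor 1/(1 + SSG), which is the trade-off against transient behaviour mentioned in the text.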


5.6.5 Exercises 


The response x(t) of a system to a forcing function 
u(t) is determined by the differential equation model 
2 
x 20,5 = 39921 
dt dt dt 
(a) Determine the transfer function characterizing 
the system. 
(b) Write down the characteristic equation of the 
system. What is the order of the system? 
(c) Determine the transfer function poles and 
zeros, and illustrate them diagrammatically in 
the s plane. 


Repeat Exercise 34 for a system whose response 
x(t) to an input u(t) is determined by the differential 
equation 


dx. cd dx аи 


= du 
C24 52841724 13x = SH so 


dt dt dt dt dt 


+6 


Which of the following transfer functions represent 
stable systems and which represent unstable systems? 


(a) s-1 (b (s 2)(s - 2) 
(s +2)(s° +4) (s+1)(s- 1)(s+4) 
(c) s=1 6 
(5 + 2)(5 +4) (s? +5 +1)(5 + 1) 


(е) 5(s +10) 
(s 5)(s? - s +10) 


37 Which of the following characteristic equations are
representative of stable systems?

(a) $s^2 - 4s + 13 = 0$
(b) $5s^3 + 13s^2 + 31s + 15 = 0$
(c) $s^3 + s^2 + s + 1 = 0$
(d) $24s^4 + 11s^3 + 26s^2 + 45s + 36 = 0$
(e) $s^3 + 2s^2 + 2s + 1 = 0$


38 The differential equation governing the motion of a
mass-spring-damper system with controller is
\[ m\frac{d^3x}{dt^3} + c\frac{d^2x}{dt^2} + K\frac{dx}{dt} + Krx = 0 \]
where m, c, K and r are positive constants. Show
that the motion of the system is stable provided that
r < c/m.


39 The behaviour of a system having a gain controller
is characterized by the characteristic equation
\[ s^4 + 2s^3 + (K+2)s^2 + 7s + K = 0 \]
where K is the controller gain. Show that the system
is stable provided that K > 2.1.


40 A feedback control system has characteristic equation
\[ s^3 + 15Ks^2 + (2K-1)s + 5K = 0 \]
where K is a constant gain factor. Determine the
range of positive values of K for which the system
will be stable.


41 Determine the impulse responses of the linear
systems whose response x(t) to an input u(t) is
determined by the following differential equations:

(a) $\dfrac{d^2x}{dt^2} + 15\dfrac{dx}{dt} + 56x = 3u(t)$
(b) $\dfrac{d^2x}{dt^2} + 8\dfrac{dx}{dt} + 25x = u(t)$
(c) $\dfrac{d^2x}{dt^2} - 2\dfrac{dx}{dt} - 8x = 4u(t)$
(d) $\dfrac{d^2x}{dt^2} - 4\dfrac{dx}{dt} + 13x = u(t)$

What can be said about the stability of each of the
systems?


42 The response of a given system to a unit step
u(t) = 1H(t) is given by
\[ x(t) = 1 - \tfrac{7}{3}e^{-t} + \tfrac{3}{2}e^{-2t} - \tfrac{1}{6}e^{-4t} \]
What is the transfer function of the system?


43 Verify the initial-value theorem for the functions

(a) $2 - 3\cos t$  (b) $(3t-1)^2$  (c) $t + 3\sin 2t$


44 Verify the final-value theorem for the functions

(a) $1 + 3e^{-t}\sin 2t$  (b) $t^2e^{-2t}$  (c) $3 - 2e^{-3t} + e^{-t}\cos 2t$


45 Using the final-value theorem, check the value
obtained for $i_2(t)$ as $t \to \infty$ for the circuit of
Example 5.28.


46 Discuss the applicability of the final-value theorem
for obtaining the value of $i_2(t)$ as $t \to \infty$ for the
circuit of Example 5.29.


47 Use the initial- and final-value theorems to find the
jump at t = 0 and the limiting value as $t \to \infty$ for the
solution of the initial-value problem
\[ 7\frac{dy}{dt} + 5y = 4 + e^{-3t} + 2\delta(t) \]
with $y(0-) = -1$.


5.6.6 Convolution


Convolution is a useful concept that has many applications in various fields of 
engineering. In Section 5.6.7 we shall use it to obtain the response of a linear system to 
any input in terms of the impulse response. 


Definition 5.5: Convolution


Given two piecewise-continuous functions f(t) and g(t), the convolution of f(t) and
g(t), denoted by f * g(t), is defined as

\[ f * g(t) = \int_{-\infty}^{\infty} f(\tau)\,g(t-\tau)\,d\tau \]

In the particular case when f(t) and g(t) are causal functions

\[ f(\tau) = g(\tau) = 0 \quad (\tau < 0), \qquad g(t-\tau) = 0 \quad (\tau > t) \]

and we have

\[ f * g(t) = \int_0^t f(\tau)\,g(t-\tau)\,d\tau \tag{5.78} \]


The notation f * g(t) indicates that the convolution f * g is a function of t; that is, it could
also be written as (f * g)(t). The integral $\int_{-\infty}^{\infty} f(\tau)\,g(t-\tau)\,d\tau$ is called the convolution
integral. Alternative names are the superposition integral, Duhamel integral, folding
integral and faltung integral.


Convolution can be considered as a generalized function, and as such it has many of
the properties of multiplication. In particular, the commutative law is satisfied, so that

\[ f * g(t) = g * f(t) \]

or, for causal functions,

\[ \int_0^t f(\tau)\,g(t-\tau)\,d\tau = \int_0^t f(t-\tau)\,g(\tau)\,d\tau \tag{5.79} \]

This means that the convolution can be evaluated by time-shifting either of the two
functions. The result (5.79) is readily proved, since by making the substitution $\tau_1 = t - \tau$
in (5.78) we obtain

\[ f * g(t) = \int_t^0 f(t-\tau_1)\,g(\tau_1)\,(-d\tau_1) = \int_0^t f(t-\tau_1)\,g(\tau_1)\,d\tau_1 = g * f(t) \]


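Example 5.58 For the two causal functions
\[ f(t) = tH(t), \qquad g(t) = \sin 2t\,H(t) \]
show that f * g(t) = g * f(t).


Solution
\[ f * g(t) = \int_0^t f(\tau)\,g(t-\tau)\,d\tau = \int_0^t \tau\sin 2(t-\tau)\,d\tau \]
Integrating by parts gives
\[ f * g(t) = \left[\tfrac{1}{2}\tau\cos 2(t-\tau) + \tfrac{1}{4}\sin 2(t-\tau)\right]_0^t = \tfrac{1}{2}t - \tfrac{1}{4}\sin 2t \]
Likewise,
\[ g * f(t) = \int_0^t f(t-\tau)\,g(\tau)\,d\tau = \int_0^t (t-\tau)\sin 2\tau\,d\tau
 = \left[-\tfrac{1}{2}(t-\tau)\cos 2\tau - \tfrac{1}{4}\sin 2\tau\right]_0^t = \tfrac{1}{2}t - \tfrac{1}{4}\sin 2t \]
so that f * g(t) = g * f(t).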
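The equality in Example 5.58 can also be spot-checked numerically. The following is a minimal sketch, assuming a simple trapezoidal approximation of the causal convolution integral (5.78); the helper name `convolve`, the test time t = 1.5 and the number of subintervals are illustrative choices, not part of the text.

```python
import math

def convolve(f, g, t, n=2000):
    """Approximate (f*g)(t) = integral_0^t f(tau) g(t - tau) dtau
    with the composite trapezoidal rule on n subintervals."""
    if t == 0:
        return 0.0
    h = t / n
    total = 0.5 * (f(0.0) * g(t) + f(t) * g(0.0))   # end-point terms
    for k in range(1, n):
        tau = k * h
        total += f(tau) * g(t - tau)                # interior terms
    return total * h

f = lambda t: t                  # f(t) = t H(t), evaluated for t >= 0 only
g = lambda t: math.sin(2 * t)    # g(t) = sin 2t H(t), evaluated for t >= 0 only
exact = lambda t: 0.5 * t - 0.25 * math.sin(2 * t)  # closed form from Example 5.58

t = 1.5
fg = convolve(f, g, t)   # time-shifting g
gf = convolve(g, f, t)   # time-shifting f
print(fg, gf, exact(t))
```

Both orderings should agree with each other to rounding error and with the closed form to within the discretization error of the trapezoidal rule.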


The importance of convolution in Laplace transform work is that it enables us to
obtain the inverse transform of the product of two transforms. The necessary result for
doing this is contained in the following theorem.


Theorem 5.8 Convolution theorem for Laplace transforms


If f(t) and g(t) are of exponential order $\sigma$, piecewise-continuous on $t \ge 0$ and
have Laplace transforms F(s) and G(s) respectively, then, for $s > \sigma$,

\[ \mathcal{L}\left\{\int_0^t f(\tau)\,g(t-\tau)\,d\tau\right\} = \mathcal{L}\{f * g(t)\} = F(s)G(s) \]

or, in the more useful inverse form,

\[ \mathcal{L}^{-1}\{F(s)G(s)\} = f * g(t) \tag{5.80} \]


Proof

By definition,

\[ F(s)G(s) = \mathcal{L}\{f(t)\}\,\mathcal{L}\{g(t)\} = \left[\int_0^\infty e^{-sx}f(x)\,dx\right]\left[\int_0^\infty e^{-sy}g(y)\,dy\right] \]

where we have used the 'dummy' variables x and y, rather than t, in the integrals to
avoid confusion. This may now be expressed in the form of the double integral

\[ F(s)G(s) = \int_0^\infty\!\!\int_0^\infty e^{-s(x+y)}f(x)\,g(y)\,dx\,dy = \iint_R e^{-s(x+y)}f(x)\,g(y)\,dx\,dy \]

where R is the first quadrant in the (x, y) plane, as shown in Figure 5.46(a). On making
the substitution

\[ x + y = t, \qquad y = \tau \]

the double integral is transformed into

\[ F(s)G(s) = \iint_{R_1} e^{-st}f(t-\tau)\,g(\tau)\,dt\,d\tau \]

where $R_1$ is the semi-infinite region in the $(\tau, t)$ plane bounded by the lines $\tau = 0$ and
$\tau = t$, as shown in Figure 5.46(b). This may be written as

\[ F(s)G(s) = \int_0^\infty e^{-st}\left(\int_0^t f(t-\tau)\,g(\tau)\,d\tau\right)dt = \int_0^\infty e^{-st}[g * f(t)]\,dt = \mathcal{L}\{g * f(t)\} \]

Figure 5.46 Regions of integration: (a) region $R$; (b) region $R_1$.

and, since convolution is commutative, we may write this as

\[ F(s)G(s) = \mathcal{L}\{f * g(t)\} \]

which concludes the proof.


end of theorem


Example 5.59 Using the convolution theorem, determine $\mathcal{L}^{-1}\left\{\dfrac{1}{s^2(s+2)^2}\right\}$.


Solution We express $1/s^2(s+2)^2$ as $(1/s^2)[1/(s+2)^2]$; then, since

\[ \mathcal{L}\{t\} = \frac{1}{s^2}, \qquad \mathcal{L}\{te^{-2t}\} = \frac{1}{(s+2)^2} \]

taking f(t) = t and $g(t) = te^{-2t}$ in the convolution theorem gives

\[ \mathcal{L}^{-1}\left\{\frac{1}{s^2}\,\frac{1}{(s+2)^2}\right\} = \int_0^t f(t-\tau)\,g(\tau)\,d\tau = \int_0^t (t-\tau)\,\tau e^{-2\tau}\,d\tau \]

which on integration by parts gives

\[ \mathcal{L}^{-1}\left\{\frac{1}{s^2}\,\frac{1}{(s+2)^2}\right\} = \left[-\tfrac{1}{2}e^{-2\tau}\left[(t-\tau)\tau + \tfrac{1}{2}(t-2\tau) - \tfrac{1}{2}\right]\right]_0^t = \tfrac{1}{4}\left[t - 1 + (t+1)e^{-2t}\right] \]

We can check this result by first expressing the given transform in partial-fractions
form and then inverting to give

\[ \frac{1}{s^2(s+2)^2} = -\frac{\tfrac14}{s} + \frac{\tfrac14}{s^2} + \frac{\tfrac14}{s+2} + \frac{\tfrac14}{(s+2)^2} \]

so that

\[ \mathcal{L}^{-1}\left\{\frac{1}{s^2(s+2)^2}\right\} = -\tfrac14 + \tfrac14 t + \tfrac14 e^{-2t} + \tfrac14 te^{-2t} = \tfrac14\left[t - 1 + (t+1)e^{-2t}\right] \]

as before.


5.6.7 System response to an arbitrary input


The impulse response of a linear time-invariant system is particularly useful in practice
in that it enables us to obtain the response of the system to an arbitrary input using the
convolution integral. This provides engineers with a powerful approach to the analysis
of dynamical systems.

Let us consider a linear system characterized by its impulse response h(t). Then
we wish to determine the response x(t) of the system to an arbitrary input u(t) such as
that illustrated in Figure 5.47(a). We first approximate the continuous function u(t) by
an infinite sequence of impulses of magnitude $u(n\Delta T)$, n = 0, 1, 2, ..., as shown in
Figure 5.47(b). This approximation for u(t) may be written as

\[ u(t) \simeq \sum_{n=0}^{\infty} u(n\Delta T)\,\delta(t - n\Delta T)\,\Delta T \tag{5.81} \]

Figure 5.47 Approximation to a continuous input.


Since the system is linear, the principle of superposition holds, so that the response of 
the system to the sum of the impulses is equal to the sum of the responses of the system 
to each of the impulses acting separately. Depicting the impulse response h(t) of the 
linear system by Figure 5.48, the responses due to the individual impulses forming the 
sum in (5.81) are illustrated in the sequence of plots in Figure 5.49. 
Figure 5.48 Impulse response of a linear system.


Figure 5.49 Responses due to individual impulses: the input $u(n\Delta T)\,\Delta T\,\delta(t - n\Delta T)$ produces the output $\Delta T\,u(n\Delta T)\,h(t - n\Delta T)$.


Summing the individual responses, we find that the response due to the sum of the
impulses is
\[ \sum_{n=0}^{\infty} u(n\Delta T)\,h(t - n\Delta T)\,\Delta T \tag{5.82} \]
Allowing $\Delta T \to 0$, so that $n\Delta T$ approaches a continuous variable $\tau$, the above sum will
approach an integral that will be representative of the system response x(t) to the
continuous input u(t). Thus
\[ x(t) = \int_0^\infty u(\tau)\,h(t-\tau)\,d\tau = \int_0^t u(\tau)\,h(t-\tau)\,d\tau \quad (\text{since } h(t) \text{ is a causal function}) \]
That is,
\[ x(t) = u * h(t) \]

Since convolution is commutative, we may also write

\[ x(t) = h * u(t) = \int_0^t h(\tau)\,u(t-\tau)\,d\tau \]

In summary, we have the result that if the impulse response of a linear time-invariant
system is h(t) then its response to an arbitrary input u(t) is

\[ x(t) = \int_0^t u(\tau)\,h(t-\tau)\,d\tau = \int_0^t h(\tau)\,u(t-\tau)\,d\tau \tag{5.83} \]


It is important to realize that this is the response of the system to the input u(t) assuming 
it to be initially in a quiescent state. 
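The limiting argument above can be illustrated numerically. The sketch below evaluates the superposition sum (5.82) directly for two step sizes; the impulse response $h(t) = e^{-t}$, the unit step input and the function name `response_by_superposition` are assumptions chosen purely for illustration, since for them the convolution integral (5.83) has the simple closed form $1 - e^{-t}$.

```python
import math

def response_by_superposition(u, h, t, dT):
    """Evaluate the sum (5.82): sum over n of u(n dT) h(t - n dT) dT.
    Terms with n dT > t vanish because h is causal, so the loop stops at t."""
    total = 0.0
    n = 0
    while n * dT <= t:
        total += u(n * dT) * h(t - n * dT) * dT
        n += 1
    return total

h = lambda t: math.exp(-t)   # assumed impulse response (illustrative only)
u = lambda t: 1.0            # unit step input, evaluated for t >= 0 only
t = 2.0
exact = 1.0 - math.exp(-t)   # value of integral_0^t e^{-(t - tau)} dtau

coarse = response_by_superposition(u, h, t, 0.1)
fine = response_by_superposition(u, h, t, 0.001)
print(coarse, fine, exact)
```

Reducing the step size moves the sum towards the value of the integral, which is the content of the passage from (5.82) to (5.83).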


Example 5.60 The response $\theta_o(t)$ of a system to a driving force $\theta_i(t)$ is given by the linear differential
equation
\[ \frac{d^2\theta_o}{dt^2} + 2\frac{d\theta_o}{dt} + 5\theta_o = \theta_i \]
Determine the impulse response of the system. Hence, using the convolution integral,
determine the response of the system to a unit step input at time t = 0, assuming that it
is initially in a quiescent state. Confirm this latter result by direct calculation.


Solution The impulse response h(t) is the solution of
\[ \frac{d^2h}{dt^2} + 2\frac{dh}{dt} + 5h = \delta(t) \]
subject to the initial conditions $h(0) = \dot{h}(0) = 0$. Taking Laplace transforms gives
\[ (s^2 + 2s + 5)H(s) = \mathcal{L}\{\delta(t)\} = 1 \]
so that
\[ H(s) = \frac{1}{s^2 + 2s + 5} = \tfrac12\,\frac{2}{(s+1)^2 + 2^2} \]
which, on inversion, gives the impulse response as
\[ h(t) = \tfrac12 e^{-t}\sin 2t \]
Using the convolution integral
\[ \theta_o(t) = \int_0^t h(\tau)\,\theta_i(t-\tau)\,d\tau \]
with $\theta_i(t) = 1H(t)$ gives the response to the unit step as
\[ \theta_o(t) = \tfrac12\int_0^t e^{-\tau}\sin 2\tau\,d\tau \]
Integrating by parts twice gives
\[ \theta_o(t) = -\tfrac12 e^{-t}\sin 2t - e^{-t}\cos 2t + 1 - 2\int_0^t e^{-\tau}\sin 2\tau\,d\tau
 = -\tfrac12 e^{-t}\sin 2t - e^{-t}\cos 2t + 1 - 4\theta_o(t) \]
Hence
\[ \theta_o(t) = \tfrac15\left(1 - e^{-t}\cos 2t - \tfrac12 e^{-t}\sin 2t\right) \]
(Note that in this case, because of the simple form of $\theta_i(t)$, the convolution integral
$\int_0^t h(\tau)\,\theta_i(t-\tau)\,d\tau$ is taken in preference to $\int_0^t \theta_i(\tau)\,h(t-\tau)\,d\tau$.)
To obtain the step response directly, we need to solve for $t \ge 0$ the differential
equation
\[ \frac{d^2\theta_o}{dt^2} + 2\frac{d\theta_o}{dt} + 5\theta_o = 1 \]
subject to the initial conditions $\theta_o(0) = \dot{\theta}_o(0) = 0$. Taking Laplace transforms gives
\[ (s^2 + 2s + 5)\Theta_o(s) = \frac{1}{s} \]
so that
\[ \Theta_o(s) = \frac{1}{s(s^2 + 2s + 5)} = \frac{\tfrac15}{s} - \tfrac15\,\frac{s+2}{(s+1)^2 + 4} \]
which, on inversion, gives
\[ \theta_o(t) = \tfrac15 - \tfrac15 e^{-t}\left(\cos 2t + \tfrac12\sin 2t\right) = \tfrac15\left(1 - e^{-t}\cos 2t - \tfrac12 e^{-t}\sin 2t\right) \]
confirming the previous result.
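As a further, purely numerical confirmation of Example 5.60, the convolution integral for the unit step can be approximated by quadrature and compared with the closed-form step response. This is a sketch only: it assumes the impulse response $h(t) = \tfrac12 e^{-t}\sin 2t$ and the step response $\tfrac15(1 - e^{-t}\cos 2t - \tfrac12 e^{-t}\sin 2t)$ of the example, and the function names and sample times are illustrative.

```python
import math

def h(t):
    """Impulse response of Example 5.60: h(t) = (1/2) e^{-t} sin 2t."""
    return 0.5 * math.exp(-t) * math.sin(2 * t)

def step_response_numeric(t, n=4000):
    """Trapezoidal approximation of integral_0^t h(tau) * 1 dtau (unit step input)."""
    dt = t / n
    total = 0.5 * (h(0.0) + h(t))
    for k in range(1, n):
        total += h(k * dt)
    return total * dt

def step_response_exact(t):
    """Closed form of the example: (1/5)(1 - e^{-t} cos 2t - (1/2) e^{-t} sin 2t)."""
    return 0.2 * (1 - math.exp(-t) * math.cos(2 * t)
                  - 0.5 * math.exp(-t) * math.sin(2 * t))

times = (0.5, 1.0, 3.0)
errors = [abs(step_response_numeric(t) - step_response_exact(t)) for t in times]
for t, err in zip(times, errors):
    print(t, step_response_numeric(t), step_response_exact(t), err)
```

The printed differences should be at the level of the quadrature error, which shrinks as `n` is increased.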


We therefore see that a linear time-invariant system may be characterized in the
frequency domain (or s domain) by its transfer function G(s) or in the time domain by
its impulse response h(t), as depicted in Figures 5.50(a) and (b) respectively. The
response in the frequency domain is obtained by algebraic multiplication, while the
time-domain response involves a convolution. This equivalence of the operation of
convolution in the time domain with algebraic multiplication in the frequency domain
is clearly a powerful argument for the use of frequency-domain techniques in
engineering design.


Figure 5.50 (a) Frequency-domain and (b) time-domain representations of a linear
time-invariant system: (a) $X(s) = G(s)U(s)$; (b) $x(t) = u * h(t)$.
5.6.8 Exercises


48 For the following pairs of causal functions f(t) and
g(t) show that f * g(t) = g * f(t):

(a) $f(t) = t$, $g(t) = \cos 3t$
(b) $f(t) = t + 1$, $g(t) = e^{-2t}$
(c) $f(t) = t^2$, $g(t) = \sin 2t$
(d) $f(t) = e^{-t}$, $g(t) = \sin t$


49 Using the convolution theorem, determine the
following inverse Laplace transforms. Check your
results by first expressing the given transform in
partial-fractions form and then inverting using the
standard results:

(a) $\mathcal{L}^{-1}\left\{\dfrac{1}{s(s+3)^3}\right\}$
(b) $\mathcal{L}^{-1}\left\{\dfrac{1}{(s-2)^2(s+3)^2}\right\}$
(c) $\mathcal{L}^{-1}\left\{\dfrac{1}{s^2(s+4)}\right\}$


50 Taking $f(\lambda) = \lambda$ and $g(\lambda) = e^{-\lambda}$, use the inverse form
(5.80) of the convolution theorem to show that the
solution of the integral equation
\[ y(t) = \int_0^t \lambda e^{-(t-\lambda)}\,d\lambda \]
is
\[ y(t) = (t - 1) + e^{-t} \]


51 Find the impulse response of the system
characterized by the differential equation
\[ \frac{d^2x}{dt^2} + 7\frac{dx}{dt} + 12x = u(t) \]
and hence find the response of the system to the
pulse input u(t) = A[H(t) - H(t - T)], assuming that
it is initially in a quiescent state.


52 The response $\theta_o(t)$ of a servomechanism to a driving
force $\theta_i(t)$ is given by the second-order differential
equation
\[ \frac{d^2\theta_o}{dt^2} + 4\frac{d\theta_o}{dt} + 5\theta_o = \theta_i \quad (t \ge 0) \]
Determine the impulse response of the system,
and hence, using the convolution integral, obtain
the response of the servomechanism to a unit step
driving force, applied at time t = 0, given that the
system is initially in a quiescent state.

Check your answer by directly solving the
differential equation
\[ \frac{d^2\theta_o}{dt^2} + 4\frac{d\theta_o}{dt} + 5\theta_o = 1 \]
subject to the initial conditions $\theta_o = \dot{\theta}_o = 0$
when t = 0.


5.7 Solution of state-space equations


In this section we return to consider further the state-space model of dynamical systems 
introduced in Section 1.9. In particular we consider how Laplace transform methods 
may be used to solve the state-space equations. 


5.7.1 SISO systems 


In Section 1.9.1 we saw that the single-input-single-output system characterized by the
differential equation (1.66) may be expressed in the state-space form

\[ \dot{x} = Ax + bu \tag{5.84a} \]
\[ y = c^{T}x \tag{5.84b} \]

where $x = x(t) = [x_1 \ x_2 \ \dots \ x_n]^T$ is the state vector and y the scalar output, the
corresponding input-output transfer function model being

\[ G(s) = \frac{Y(s)}{U(s)} = \frac{b_m s^m + \dots + b_0}{s^n + a_{n-1}s^{n-1} + \dots + a_0} \quad \begin{matrix} \leftarrow c \\ \leftarrow A \end{matrix} \tag{5.85} \]

where Y(s) and U(s) are the Laplace transforms of y(t) and u(t) respectively. We define
A and b as in (1.60); that is, we take A to be the companion matrix of the left-hand side
of (1.66) and take $b = [0 \ 0 \ \dots \ 0 \ 1]^T$. In order to achieve the desired response, the vector
c is chosen to be

\[ c = [b_0 \ b_1 \ \dots \ b_m \ 0 \ \dots \ 0]^T \tag{5.86} \]

a structure we can confirm to be appropriate using Laplace transform notation. Defining
$X_i(s) = \mathcal{L}\{x_i(t)\}$ and taking

\[ X_1(s) = \frac{1}{s^n + a_{n-1}s^{n-1} + \dots + a_0}\,U(s) \]

we have

\[ X_2(s) = sX_1(s), \quad X_3(s) = sX_2(s) = s^2X_1(s), \quad \dots, \quad X_n(s) = sX_{n-1}(s) = s^{n-1}X_1(s) \]

so that

\[ Y(s) = b_0X_1(s) + b_1X_2(s) + \dots + b_mX_{m+1}(s) = \frac{b_0 + b_1s + \dots + b_ms^m}{s^n + a_{n-1}s^{n-1} + \dots + a_0}\,U(s) \]

which confirms (5.86).

Note that adopting this structure for the state-space representation the last row in A
and the vector c may be obtained directly from the transfer function (5.85) by reading
the coefficients of the denominator and numerator backwards as indicated by the
arrows, and negating those in the denominator.


Example 5.61 For the system characterized by the differential equation model

\[ \frac{d^3y}{dt^3} + 6\frac{d^2y}{dt^2} + 11\frac{dy}{dt} + 3y = 5\frac{d^2u}{dt^2} + \frac{du}{dt} + u \tag{5.87} \]

considered in Example 1.41, obtain

(a) a transfer function model;
(b) a state-space model.


Solution (a) Assuming all initial conditions to be zero, taking Laplace transforms throughout
in (5.87) leads to

\[ (s^3 + 6s^2 + 11s + 3)Y(s) = (5s^2 + s + 1)U(s) \]

so that the transfer-function model is given by

\[ G(s) = \frac{Y(s)}{U(s)} = \frac{5s^2 + s + 1}{s^3 + 6s^2 + 11s + 3} \quad \begin{matrix} \leftarrow c \\ \leftarrow A \end{matrix} \]

(b) Taking A to be the companion matrix

\[ A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -3 & -11 & -6 \end{bmatrix} \quad \text{and} \quad b = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

then $c = [1 \ 1 \ 5]^T$ and the corresponding state-space model is given by (5.84).


Note: The eigenvalues of the state matrix A are given by the roots of the characteristic
equation $|\lambda I - A| = \lambda^3 + 6\lambda^2 + 11\lambda + 3 = 0$, which are the same as the poles
of the transfer function G(s).


Defining

\[ \mathcal{L}\{x(t)\} = \begin{bmatrix} \mathcal{L}\{x_1(t)\} \\ \mathcal{L}\{x_2(t)\} \\ \vdots \\ \mathcal{L}\{x_n(t)\} \end{bmatrix} = \begin{bmatrix} X_1(s) \\ X_2(s) \\ \vdots \\ X_n(s) \end{bmatrix} = X(s) \]

and then taking the Laplace transform throughout in the state equation (5.84a) gives
\[ sX(s) - x(0) = AX(s) + bU(s) \]
which on rearranging gives
\[ (sI - A)X(s) = x(0) + bU(s) \]
where I is the identity matrix. Premultiplying throughout by $(sI - A)^{-1}$ gives
\[ X(s) = (sI - A)^{-1}x(0) + (sI - A)^{-1}bU(s) \tag{5.88} \]
which on taking inverse Laplace transforms gives the response as
\[ x(t) = \mathcal{L}^{-1}\{(sI - A)^{-1}\}x(0) + \mathcal{L}^{-1}\{(sI - A)^{-1}bU(s)\} \tag{5.89} \]
Having obtained an expression for the system state x(t) its output, or response, y(t) may
be obtained from the linear output equation (5.84b).
Taking the Laplace transform throughout in (5.84b) gives
\[ Y(s) = c^{T}X(s) \tag{5.90} \]
Assuming zero initial conditions in (5.88) we have
\[ X(s) = (sI - A)^{-1}bU(s) \]
which, on substitution in (5.90), gives the input-output relationship
\[ Y(s) = c^{T}(sI - A)^{-1}bU(s) \tag{5.91} \]
From (5.91) it follows that the system transfer function G(s) may be expressed in the form
\[ G(s) = c^{T}(sI - A)^{-1}b = \frac{c^{T}\,\mathrm{adj}(sI - A)\,b}{\det(sI - A)} \]
which indicates that the eigenvalues of A are the same as the poles of G(s), as noted at
the end of Example 5.61. It follows, from Definition 5.2, that the system is stable
provided all the eigenvalues of the state matrix A have negative real parts.
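The relationship $G(s) = c^{T}(sI - A)^{-1}b$ can be checked numerically for the model of Example 5.61. The following is a minimal sketch in plain Python: the helper names (`companion_model`, `solve`, `transfer_from_state_space`) and the test point s = 2 + j are illustrative assumptions, and the coefficient lists are taken in ascending powers of s with a monic denominator.

```python
def companion_model(num, den):
    """Build (A, b, c) in the companion form used in the text.
    num = [b_0, ..., b_m] and den = [a_0, ..., a_{n-1}] are coefficients in
    ascending powers of s; the denominator is assumed monic (leading 1 omitted)."""
    n = len(den)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = 1.0                 # superdiagonal of ones
    A[n - 1] = [-a for a in den]          # last row: negated denominator coefficients
    b = [0.0] * (n - 1) + [1.0]
    c = list(num) + [0.0] * (n - len(num))
    return A, b, c

def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0j] * n
    for i in reversed(range(n)):
        acc = aug[i][n] - sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = acc / aug[i][i]
    return x

def transfer_from_state_space(A, b, c, s):
    """Evaluate c^T (sI - A)^{-1} b at the complex number s."""
    n = len(A)
    M = [[(s if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    x = solve(M, b)                       # x = (sI - A)^{-1} b
    return sum(ci * xi for ci, xi in zip(c, x))

# Example 5.61: G(s) = (5s^2 + s + 1)/(s^3 + 6s^2 + 11s + 3)
A, b, c = companion_model([1, 1, 5], [3, 11, 6])
s = 2 + 1j                                # arbitrary test point, not a pole
direct = (5 * s**2 + s + 1) / (s**3 + 6 * s**2 + 11 * s + 3)
via_ss = transfer_from_state_space(A, b, c, s)
print(A, b, c)
print(direct, via_ss)
```

Agreement at a single point is not a proof, but repeating the comparison at a few distinct values of s gives a quick sanity check on a hand-built state-space model.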


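On comparing the solution (5.89) with that given in (1.81), we find that the transition
matrix $\Phi(t) = e^{At}$ may also be written in the form
\[ \Phi(t) = \mathcal{L}^{-1}\{(sI - A)^{-1}\} \]
As mentioned in Section 1.10.3, having obtained $\Phi(t)$,
\[ \Phi(t, t_0) = e^{A(t - t_0)} \]
may be obtained by simply replacing t by $t - t_0$.


Example 5.62 Using the Laplace transform approach, obtain an expression for the state x(t) of the
system characterized by the state equation
\[ \dot{x}(t) = \begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 1 & -3 \end{bmatrix}\begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \end{bmatrix}u(t) \]
when the input u(t) is the unit step function
\[ u(t) = H(t) = \begin{cases} 0 & (t < 0) \\ 1 & (t \ge 0) \end{cases} \]
and subject to the initial condition $x(0) = [1 \ 1]^T$.


Solution In this case
\[ A = \begin{bmatrix} -1 & 0 \\ 1 & -3 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
Thus
\[ sI - A = \begin{bmatrix} s+1 & 0 \\ -1 & s+3 \end{bmatrix}, \qquad \det(sI - A) = (s+1)(s+3) \]
giving
\[ (sI - A)^{-1} = \frac{1}{(s+1)(s+3)}\begin{bmatrix} s+3 & 0 \\ 1 & s+1 \end{bmatrix} = \begin{bmatrix} \dfrac{1}{s+1} & 0 \\[2mm] \dfrac{1}{2(s+1)} - \dfrac{1}{2(s+3)} & \dfrac{1}{s+3} \end{bmatrix} \]
which, on taking inverse transforms, gives the transition matrix as
\[ e^{At} = \mathcal{L}^{-1}\{(sI - A)^{-1}\} = \begin{bmatrix} e^{-t} & 0 \\ \tfrac12(e^{-t} - e^{-3t}) & e^{-3t} \end{bmatrix} \]
so that the first term in the solution (5.89) becomes
\[ \mathcal{L}^{-1}\{(sI - A)^{-1}\}x(0) = \begin{bmatrix} e^{-t} & 0 \\ \tfrac12(e^{-t} - e^{-3t}) & e^{-3t} \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} e^{-t} \\ \tfrac12 e^{-t} + \tfrac12 e^{-3t} \end{bmatrix} \tag{5.92} \]
Since $U(s) = \mathcal{L}\{H(t)\} = 1/s$,
\[ (sI - A)^{-1}bU(s) = \frac{1}{s(s+1)(s+3)}\begin{bmatrix} s+3 & 0 \\ 1 & s+1 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \frac{1}{s(s+1)(s+3)}\begin{bmatrix} s+3 \\ s+2 \end{bmatrix}
 = \begin{bmatrix} \dfrac{1}{s} - \dfrac{1}{s+1} \\[2mm] \dfrac{2}{3s} - \dfrac{1}{2(s+1)} - \dfrac{1}{6(s+3)} \end{bmatrix} \]
so that the second term in (5.89) becomes
\[ \mathcal{L}^{-1}\{(sI - A)^{-1}bU(s)\} = \begin{bmatrix} 1 - e^{-t} \\ \tfrac23 - \tfrac12 e^{-t} - \tfrac16 e^{-3t} \end{bmatrix} \tag{5.93} \]
Combining (5.92) and (5.93), the response x(t) is given by
\[ x(t) = \begin{bmatrix} 1 \\ \tfrac23 + \tfrac13 e^{-3t} \end{bmatrix} \]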
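The two contributions (5.92) and (5.93) of Example 5.62 can be cross-checked by integrating the state equation numerically. The sketch below assumes the state matrix, input vector and initial condition of the example, namely $\dot{x}_1 = -x_1 + u$, $\dot{x}_2 = x_1 - 3x_2 + u$ with $x(0) = [1 \ 1]^T$ and a unit step input; the classical fourth-order Runge-Kutta scheme, the step count and the function names are illustrative choices rather than anything prescribed by the text.

```python
import math

def f(x, u=1.0):
    """Right-hand side of the state equation of Example 5.62 for a unit step input."""
    x1, x2 = x
    return (-x1 + u, x1 - 3 * x2 + u)

def rk4(x, t_end, n=2000):
    """Integrate x' = f(x) from t = 0 to t_end with n classical Runge-Kutta steps."""
    h = t_end / n
    for _ in range(n):
        k1 = f(x)
        k2 = f((x[0] + 0.5 * h * k1[0], x[1] + 0.5 * h * k1[1]))
        k3 = f((x[0] + 0.5 * h * k2[0], x[1] + 0.5 * h * k2[1]))
        k4 = f((x[0] + h * k3[0], x[1] + h * k3[1]))
        x = (x[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
             x[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
    return x

def closed_form(t):
    """Sum of the free response (5.92) and the forced response (5.93)."""
    x1 = math.exp(-t) + (1 - math.exp(-t))
    x2 = (0.5 * math.exp(-t) + 0.5 * math.exp(-3 * t)) \
         + (2 / 3 - 0.5 * math.exp(-t) - math.exp(-3 * t) / 6)
    return (x1, x2)

t = 1.0
numeric = rk4((1.0, 1.0), t)
exact = closed_form(t)
print(numeric, exact)
```

The first state remains at 1 for all t, and the second decays towards 2/3, which is the steady state obtained by setting the derivatives to zero in the state equation.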


5.7.2 Exercises


53 A system is modelled by the following differential
equations
\[ \dot{x}_1 + 5x_1 + x_2 = 2u \]
\[ \dot{x}_2 - 3x_1 + x_2 = 5u \]
coupled with the output equation
\[ y = x_1 + 2x_2 \]
Express the model in state-space form and obtain
the transfer function of the system.


54 Find the state-space representation of the second
order system modelled by the transfer function
\[ G(s) = \frac{Y(s)}{U(s)} = \frac{s+1}{s^2 + 7s + 6} \]


55 Obtain the dynamic equations in state-space form
for the systems having transfer-function models

(a) $\dfrac{s^2 + 3s + 5}{s^3 + 6s^2 + 5s + 7}$  (b) $\dfrac{s^2 + 3s + 2}{s^3 + 4s^2 + 3s}$

using the companion form of the system matrix in
each case.


56 In formulating the state-space model (5.84) it is
sometimes desirable to specify the output y to
be the state variable $x_1$; that is, we take
$c = [1 \ 0 \ \dots \ 0]^T$. If A is again taken to be
the companion matrix of the denominator then it
can be shown that the coefficients $b_1, b_2, \dots, b_n$ of
the vector b are determined as the first n coefficients
in the series in $s^{-1}$ obtained by dividing the
denominator of the transfer function (5.85) into the
numerator. Illustrate this approach for the transfer-function
model of Figure 5.51.


Figure 5.51 Transfer-function model of Exercise 56: $U(s) \to \dfrac{5s^2 + s + 1}{s^3 + 6s^2 + 11s + 6} \to Y(s)$.
57 A system is governed by the vector-matrix
differential equation


X(t) = : ^ x) = (1) (20) 
SERT e. 1 E B 


where x(t) and u(t) are respectively the state
and input vectors of the system. Use Laplace
transforms to obtain the state vector x(t) for the
input $u(t) = [4 \ 3]^T$ and subject to the initial
condition $x(0) = [1 \ 2]^T$.


58 Given that the differential equations modelling a
certain control system are
\[ \dot{x}_1 = x_1 - 3x_2 + u \]
\[ \dot{x}_2 = 2x_1 - 4x_2 + u \]
use (5.89) to determine the state vector
$x = [x_1 \ x_2]^T$ for the control input $u = e^{-3t}$,
applied at time t = 0, given that $x_1 = x_2 = 1$ at time
t = 0.


59 Using the Laplace transform approach, obtain
an expression for the state x(t) of the system
characterized by the state equation


EE 


(t z 0) 
where the input is 
61 
0 t«0 
ub -| ( ) 
e” (t=0) 


and subject to the initial condition x(0) 2 [1 0J". 


60 A third-order single-input-single-output system is
characterized by the transfer-function model
\[ \frac{Y(s)}{U(s)} = \frac{3s^2 + 2s + 1}{s^3 + 6s^2 + 11s + 6} \]
Express the system model in the state-space form
\[ \dot{x} = Ax + bu \tag{5.94a} \]
\[ y = c^{T}x \tag{5.94b} \]
where A is in the companion form. By making a
suitable transformation x = Mz, reduce the state-space
model to its canonical form, and comment
on the stability, controllability and observability
of the system.

Given that

(i) a necessary and sufficient condition for
the system (5.94) to be controllable is
that the rank of the Kalman matrix
$[b \ \ Ab \ \ A^2b \ \dots \ A^{n-1}b]$ be the same
as the order of A, and

(ii) a necessary and sufficient condition for it to
be observable is that the rank of the Kalman
matrix $[c \ \ A^{T}c \ \ (A^{T})^2c \ \dots \ (A^{T})^{n-1}c]$ be
the same as the order of A,

evaluate the ranks of the relevant Kalman matrices
to confirm your earlier conclusions on the
controllability and observability of the given
system.


61 Repeat Exercise 60 for the system characterized by
the transfer-function model
\[ \frac{s^2 + 3s + 5}{s^3 + 6s^2 + 5s} \]


5.7.3 MIMO systems


As indicated in (1.69) the general form of the state-space model representation of an
nth-order multi-input-multi-output system subject to r inputs and l outputs is

\[ \dot{x} = Ax + Bu \tag{5.95a} \]
\[ y = Cx + Du \tag{5.95b} \]

where x is the n-state vector, u is the r-input vector, y is the l-output vector, A is the
n x n system matrix, B is the n x r control (or input) matrix and C and D are respectively
l x n and l x r output matrices, with the matrix D relating to the part of the input that is
applied directly into the output.
Defining

\[ \mathcal{L}\{y(t)\} = \begin{bmatrix} \mathcal{L}\{y_1(t)\} \\ \vdots \\ \mathcal{L}\{y_l(t)\} \end{bmatrix} = \begin{bmatrix} Y_1(s) \\ \vdots \\ Y_l(s) \end{bmatrix} = Y(s), \qquad
\mathcal{L}\{u(t)\} = \begin{bmatrix} \mathcal{L}\{u_1(t)\} \\ \vdots \\ \mathcal{L}\{u_r(t)\} \end{bmatrix} = \begin{bmatrix} U_1(s) \\ \vdots \\ U_r(s) \end{bmatrix} = U(s) \]

and taking Laplace transforms throughout in the state equation (5.95a), following the
same procedure as for the SISO case, gives
\[ X(s) = (sI - A)^{-1}x(0) + (sI - A)^{-1}BU(s) \tag{5.96} \]
Taking inverse Laplace transforms in (5.96) gives
\[ x(t) = \mathcal{L}^{-1}\{(sI - A)^{-1}\}x(0) + \mathcal{L}^{-1}\{(sI - A)^{-1}BU(s)\} \tag{5.97} \]
The output, or response, vector y(t) may then be obtained directly from (5.95b).

We can also use the Laplace transform formulation to obtain the transfer matrix
G(s), between the input and output vectors, for a multivariable system. Taking Laplace
transforms throughout in the output equation (5.95b) gives
\[ Y(s) = CX(s) + DU(s) \tag{5.98} \]
Assuming zero initial conditions in (5.96) we have
\[ X(s) = (sI - A)^{-1}BU(s) \]
Substituting in (5.98) gives the system input-output relationship
\[ Y(s) = [C(sI - A)^{-1}B + D]U(s) \]
Thus the transfer matrix G(s) model of a state-space model defined by the quadruple
{A, B, C, D} is
\[ G(s) = C(sI - A)^{-1}B + D \tag{5.99} \]


The reverse problem of obtaining a state-space model from a given transfer matrix
is not uniquely solvable. For example, in Section 1.10.6 we showed that a state-space
model can be reduced to canonical form and indicated that this was without affecting the
input-output behaviour. In Section 1.10.6 it was shown that under the transformation
x = Tz, where T is a non-singular matrix, (5.95) may be reduced to the form

\[ \dot{z} = \bar{A}z + \bar{B}u \]
\[ y = \bar{C}z + \bar{D}u \tag{5.100} \]
where z is now a state vector and

\[ \bar{A} = T^{-1}AT, \quad \bar{B} = T^{-1}B, \quad \bar{C} = CT, \quad \bar{D} = D \]
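The statement that the transformation x = Tz leaves the input-output behaviour unaffected can be spot-checked numerically by evaluating the transfer matrix $C(sI - A)^{-1}B + D$ for both quadruples at a test value of s. The sketch below is illustrative only: the two-state model, the matrix T and the test point are hypothetical values chosen for the check, and the helpers `matmul`, `inv2` and `transfer_matrix` are written here for a 2 x 2 state matrix rather than taken from any library.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def transfer_matrix(A, B, C, D, s):
    """Evaluate G(s) = C (sI - A)^{-1} B + D for a two-state model."""
    sI_minus_A = [[s - A[0][0], -A[0][1]], [-A[1][0], s - A[1][1]]]
    G = matmul(matmul(C, inv2(sI_minus_A)), B)
    return [[G[i][j] + D[i][j] for j in range(len(D[0]))] for i in range(len(D))]

# Hypothetical two-state, one-input, two-output model (values chosen for illustration).
A = [[0.0, 1.0], [-2.0, -3.0]]
B = [[1.0], [3.0]]
C = [[1.0, 0.0], [0.5, -1.0]]
D = [[0.0], [2.0]]
T = [[1.0, 2.0], [1.0, 3.0]]          # any non-singular T will do (det T = 1 here)

Tinv = inv2(T)
A_bar = matmul(matmul(Tinv, A), T)    # T^{-1} A T
B_bar = matmul(Tinv, B)               # T^{-1} B
C_bar = matmul(C, T)                  # C T
D_bar = D                             # D is unchanged

s = 1.0 + 2.0j                        # arbitrary test point away from the poles
G = transfer_matrix(A, B, C, D, s)
G_bar = transfer_matrix(A_bar, B_bar, C_bar, D_bar, s)
print(G, G_bar)
```

The two evaluations agree to rounding error, as expected when the two state-space models are related by a non-singular change of state variables.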


From (5.99), the input-output transfer matrix corresponding to (5.100) is

\[ \begin{aligned}
\bar{G}(s) &= \bar{C}(sI - \bar{A})^{-1}\bar{B} + \bar{D} \\
&= CT(sI - T^{-1}AT)^{-1}T^{-1}B + D \\
&= CT(sT^{-1}IT - T^{-1}AT)^{-1}T^{-1}B + D \\
&= CT[T^{-1}(sI - A)T]^{-1}T^{-1}B + D \\
&= CT[T^{-1}(sI - A)^{-1}T]T^{-1}B + D \quad (\text{since } (T^{-1}MT)^{-1} = T^{-1}M^{-1}T) \\
&= C(sI - A)^{-1}B + D \\
&= G(s)
\end{aligned} \]

where G(s) is the transfer matrix corresponding to (5.95), confirming that the input-output
behaviour of the state-space model defined by the quadruple $\{\bar{A}, \bar{B}, \bar{C}, \bar{D}\}$ is the same as
that defined by the quadruple {A, B, C, D}. The problem of finding state-space models
that have a specified transfer-function matrix is known as the realization problem.

It follows from (5.99) that

\[ G(s) = \frac{C\,\mathrm{adj}(sI - A)\,B}{\det(sI - A)} + D \]

Clearly, if s = p is a pole of G(s) then it must necessarily be an eigenvalue of the state
matrix A, but the converse is not necessarily true. It can be shown that the poles of G(s)
are identical to the eigenvalues of A when it is impossible to find a state-space model
with a smaller state dimension than n having the same transfer-function matrix. In such
cases the state-space model is said to be in minimal form.


Example 5.63 (a) Obtain the state-space model characterizing the network of Figure 5.52. Take the
inductor current and the voltage drop across the capacitor as the state variables,
take the input variable to be the output of the voltage source, and take the output
variables to be the currents through L and $R_2$ respectively.


(b) Find the transfer-function matrix relating the output variables $y_1$ and $y_2$ to the
input variable u. Thus find the system response to the unit step u(t) = H(t), assuming
that the circuit is initially in a quiescent state.


Figure 5.52 Network of Example 5.63.


Solution


(a) The current $i_C$ in the capacitor is given by
\[ i_C = C\dot{v}_C = C\dot{x}_2 \]
Applying Kirchhoff's second law to the outer loop gives

\[ e = R_1(i_L + i_C) + v_C + R_2i_C = R_1(x_1 + C\dot{x}_2) + x_2 + R_2C\dot{x}_2 \]




(b) 


leading to 


oe E d eee x 
CORR) OU, FR) CUN Rj) 


Хр = 


Applying Kirchhoff's second law to the left-hand loop gives

\[ e = R_1(i_L + i_C) + L\frac{di_L}{dt} = R_1(x_1 + C\dot{x}_2) + L\dot{x}_1 \]





leading to 

pered x,- Rik х,+ К, 
L(R, + R5) L(R, 4 R5) LR,+R, 

Also, 

Vi =X 

yn = C =- l S +— 











Хут X5 
R, +R, R, +R, R, +R, 


Substituting the given parameter values leads to the state-space representation 


E ДҮ 
(ЕШ 


which is of the standard form 


юш 


¥2 


= 
je i 
| 


\[ \dot{x} = Ax + bu \]
\[ y = Cx + du \]


From (5.99), the transfer-function matrix G(s) relating the output variables $y_1$ and
$y_2$ to the input u is

\[ G(s) = C(sI - A)^{-1}b + d \]


Now 
sl-A= 5+2 4 
—2 5+11 
giving 
(51 -Ay! _ 1 8 +11 —4 
(s 3)(s-10)| 2 s+2 
Cl _ Ay b Z uum п | 58+ 11 —4 : 
(st3)5*10|-2 -£|| 2  ss2|| 
1 Es 15 


7 (8+3)(8+10) 255-4 
so that
Ыз + 15 
OMNES | ш le) eee 
(з + 3)(5+ 10) —5-4 2 —%5_ 4 : 


— + 
(5+3)(5+10) 95 


The output variables $y_1$ and $y_2$ are then given by the inverse Laplace transform of
\[ Y(s) = G(s)U(s) \]
where $U(s) = \mathcal{L}\{u(t)\} = \mathcal{L}\{H(t)\} = 1/s$; that is,


Hs 15 
s(s+3)(s +10) 
ees 
S(s+3)(s+10) 15s 


Ү(5) = 


1 4 


1 L 4 
2 14 2 1 

s st+3 +10 

2 2 A 2 
Booy g 


s s+3 s+10 s 





which on taking inverse Laplace transforms gives the output variables as 


l4 La 4-10 
Ji a+ ue 76 
= (t= 0) 
2 .-3t 4 _-10t 
y2 Cae tae 


In MATLAB the function tf2ss can be used to convert a transfer function to state-space
form for SISO systems. At present there appears to be no equivalent function
for MIMO systems. Thus the command

    [A,B,C,D] = tf2ss(b,a)

returns the A, B, C, D matrices of the state-form representation of the transfer function

\[ G(s) = C(sI - A)^{-1}B + D = \frac{b_1s^{m-1} + \dots + b_{m-1}s + b_m}{a_1s^{n-1} + \dots + a_{n-1}s + a_n} \]

where the input vector a contains the denominator coefficients and the vector b
contains the numerator coefficients, both in descending powers of s.

(Note: The function tf2ss can also be used in the case of single-input-multi-output
systems. In such cases the matrix numerator must contain the numerator coefficients
with as many rows as there are outputs.)

To illustrate consider the system of Example 5.61, for which

\[ G(s) = \frac{5s^2 + s + 1}{s^3 + 6s^2 + 11s + 3} \]

In this case the commands:

    b = [5 1 1];
    a = [1 6 11 3];
    [A,B,C,D] = tf2ss(b,a)

return

    A =
        -6  -11   -3
         1    0    0
         0    1    0

    B =
         1
         0
         0

    C =
         5    1    1

    D =
         0

giving the state-space model

\[ \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \end{bmatrix} = \begin{bmatrix} -6 & -11 & -3 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}u, \qquad y = [5 \ 1 \ 1]\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \]

(Note: This state-space model differs from the one given in the answer to Example
5.61. Both forms are equivalent to the given transfer function model, with an alternative
companion form taken as indicated in Section 1.9.1.)

Likewise, in MATLAB the function ss2tf converts the state-space representation
to the equivalent transfer function/matrix representation (this being applicable
to both SISO and MIMO systems). The command

    [b,a] = ss2tf(A,B,C,D,iu)

returns the transfer function/matrix

\[ G(s) = C(sI - A)^{-1}B + D \]

from the iu-th input. Again the vector a contains the coefficients of the denominator
in descending powers of s and the numerator coefficients are returned in array b with
as many rows as there are outputs.

(Note: The entry iu in the command can be omitted when considering SISO
systems so, for example, the commands:

    A = [-6 -11 -3; 1 0 0; 0 1 0];
    B = [1; 0; 0];
    C = [5 1 1];
    D = 0;
    [b,a] = ss2tf(A,B,C,D)

return

    b =
         0    5.0000    1.0000    1.0000
    a =
         1.0000    6.0000   11.0000    3.0000

giving the transfer function representation

\[ G(s) = \frac{5s^2 + s + 1}{s^3 + 6s^2 + 11s + 3} \]

which confirms the answer to the above example. As an exercise confirm that the
state-space model obtained in the answer to Example 5.61 is also equivalent to this
transfer function representation.)

To illustrate a MIMO system consider the system in Exercise 63, in which the
state-space model is

\[ \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \\ \dot{x}_4 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ -1 & -1 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & -1 & -1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}, \qquad
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \]

and we wish to determine the equivalent transfer matrix. The commands:

    A = [0 1 0 0; -1 -1 0 1; 0 0 0 1; 0 1 -1 -1];
    B = [0 0; 1 0; 0 0; 0 1];
    C = [1 0 0 0; 0 0 1 0];
    D = [0 0; 0 0];
    [b1,a] = ss2tf(A,B,C,D,1)

return the response to $u_1$

    b1 =
         0         0    1.0000    1.0000    1.0000
         0    0.0000    0.0000    1.0000    0.0000
    a =
         1.0000    2.0000    2.0000    2.0000    1.0000

and the additional command

    [b2,a] = ss2tf(A,B,C,D,2)

returns the response to $u_2$

    b2 =
         0    0.0000    0.0000    1.0000    0.0000
         0         0    1.0000    1.0000    1.0000
    a =
         1.0000    2.0000    2.0000    2.0000    1.0000

leading to the transfer matrix model

\[ G(s) = \frac{1}{s^4 + 2s^3 + 2s^2 + 2s + 1}\begin{bmatrix} s^2 + s + 1 & s \\ s & s^2 + s + 1 \end{bmatrix}
 = \frac{1}{(s+1)^2(s^2+1)}\begin{bmatrix} s^2 + s + 1 & s \\ s & s^2 + s + 1 \end{bmatrix} \]
5.7.4 Exercises


62 Determine the response $y = x_1$ of the system
governed by the differential equations
\[ \dot{x}_1 = -2x_2 + u_1 - u_2 \]
\[ \dot{x}_2 = x_1 - 3x_2 + u_1 + u_2 \quad (t \ge 0) \]
to an input $u = [u_1 \ u_2]^T = [1 \ t]^T$ and subject to
the initial conditions $x_1(0) = 0$, $x_2(0) = 1$.


63 Consider the 2-input-2-output system modelled by
the pair of simultaneous differential equations
\[ \ddot{y}_1 + \dot{y}_1 - \dot{y}_2 + y_1 = u_1 \]
\[ \ddot{y}_2 + \dot{y}_2 - \dot{y}_1 + y_2 = u_2 \]
Taking the state vector to be $x = [y_1 \ \dot{y}_1 \ y_2 \ \dot{y}_2]^T$
express the model as a state-space model of the form
\[ \dot{x} = Ax + Bu \]
\[ y = Cx \]
Determine the transfer matrix and verify that its
poles are identical to the eigenvalues of the state
matrix A.


64 Considering the network of Figure 5.53
(a) Determine the state-space model in the form
\[ \dot{x} = Ax + Bu \]
\[ y = Cx \]











Figure 5.53 Network of Exercise 64. 


Take the inductor currents in L,, L, and L; as 
the state variables x;, x;, x; respectively; take 
the input variables u, and и, to be the outputs 
of the current and voltage sources respectively; 
and take the output variables y, and y, to be the 
voltage across R, and the current through L, 
respectively. 


(b) Determine the transfer matrix G(s) relating the 
output vector to the input vector. 


(c) Assuming that the circuit is initially in a
quiescent state, determine the response y(t)
to the input pair
\[ u_1(t) = H(t) \]
\[ u_2(t) = tH(t) \]
where H(t) denotes the Heaviside function.


5.8 Engineering application: frequency response 


Frequency-response methods provide a graphical approach for the analysis and design 
of systems. Traditionally these methods have evolved from practical considerations, 
and as such are still widely used by engineers, providing tremendous insight into over- 
all system behaviour. In this section we shall illustrate how the frequency response can 
be readily obtained from the system transfer function G(s) by simply replacing s by jω.
Methods of representing it graphically will also be considered. 

Consider the system depicted in Figure 5.41, with transfer function

$$G(s) = \frac{K(s - z_1)(s - z_2)\cdots(s - z_m)}{(s - p_1)(s - p_2)\cdots(s - p_n)} \qquad (m \le n) \tag{5.101}$$

When the input is the sinusoidally varying signal

$$u(t) = A\sin\omega t$$

applied at time t = 0, the system response x(t) for t ≥ 0 is determined by

$$X(s) = G(s)\,\mathcal{L}\{A\sin\omega t\}$$

That is,

$$X(s) = G(s)\frac{A\omega}{s^2 + \omega^2} = \frac{KA\omega(s - z_1)(s - z_2)\cdots(s - z_m)}{(s - p_1)(s - p_2)\cdots(s - p_n)(s - j\omega)(s + j\omega)}$$

which, on expanding in partial fractions, gives

$$X(s) = \frac{\alpha_1}{s - j\omega} + \frac{\alpha_2}{s + j\omega} + \sum_{i=1}^{n}\frac{\beta_i}{s - p_i}$$

where $\alpha_1, \alpha_2, \beta_1, \beta_2, \ldots, \beta_n$ are constants. Here the first two terms on the right-hand side are
generated by the input and determine the steady-state response, while the remaining
terms are generated by the transfer function and determine the system transient response.
Taking inverse Laplace transforms, the system response x(t), t ≥ 0, is given by

$$x(t) = \alpha_1 e^{j\omega t} + \alpha_2 e^{-j\omega t} + \sum_{i=1}^{n}\beta_i e^{p_i t} \qquad (t \ge 0)$$


In practice we are generally concerned with systems that are stable, for which the poles
$p_i$, i = 1, 2, ..., n, of the transfer function G(s) lie in the left half of the s plane.
Consequently, for practical systems the time-domain terms $\beta_i e^{p_i t}$, i = 1, 2, ..., n, decay
to zero as t increases, and will not contribute to the steady-state response $x_{ss}(t)$ of the
system. Thus for stable linear systems the latter is determined by the first two terms as

$$x_{ss}(t) = \alpha_1 e^{j\omega t} + \alpha_2 e^{-j\omega t}$$
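The split into steady-state and transient parts can be checked numerically. The following is a minimal sketch, assuming a hypothetical first-order system G(s) = 1/(s + 1) (not taken from the text) driven from rest by A sin ωt, for which the closed-form response is easy to write down; the values of `A` and `OMEGA` are illustrative assumptions.

```python
import math

A, OMEGA = 1.0, 2.0  # assumed input amplitude and frequency (rad/s)


def steady_state(t):
    """Terms generated by the input poles s = +/- j*omega."""
    return A / (1 + OMEGA**2) * (math.sin(OMEGA * t) - OMEGA * math.cos(OMEGA * t))


def transient(t):
    """Term generated by the system pole s = -1; it decays as t increases."""
    return A * OMEGA / (1 + OMEGA**2) * math.exp(-t)


def response(t):
    """Full response of dx/dt + x = A sin(omega t) with x(0) = 0."""
    return steady_state(t) + transient(t)


for t in (0.0, 1.0, 5.0, 10.0):
    print(f"t={t:5.1f}  x={response(t):+.6f}  "
          f"x_ss={steady_state(t):+.6f}  transient={transient(t):.2e}")
```

By t = 10 the transient term is several orders of magnitude smaller than the steady-state term, which is why only the first two terms of the expansion matter for a stable system.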


Using the 'cover-up' rule for determining the coefficients $\alpha_1$ and $\alpha_2$ in the partial-
fraction expansions gives

$$\alpha_1 = \left[G(s)\frac{A\omega}{s + j\omega}\right]_{s = j\omega} = \frac{A}{2j}G(j\omega)$$

$$\alpha_2 = \left[G(s)\frac{A\omega}{s - j\omega}\right]_{s = -j\omega} = -\frac{A}{2j}G(-j\omega)$$

so that the steady-state response becomes

$$x_{ss}(t) = \frac{A}{2j}G(j\omega)e^{j\omega t} - \frac{A}{2j}G(-j\omega)e^{-j\omega t} \tag{5.102}$$

G(jω) can be expressed in the polar form

$$G(j\omega) = |G(j\omega)|\,e^{j\arg G(j\omega)}$$

where |G(jω)| denotes the magnitude (or modulus) of G(jω). (Note that both the
magnitude and argument vary with frequency ω.) Then, assuming that the system has
real parameters,

$$G(-j\omega) = |G(j\omega)|\,e^{-j\arg G(j\omega)}$$

and the steady-state response (5.102) becomes

$$x_{ss}(t) = \frac{A}{2j}\left[|G(j\omega)|e^{j\arg G(j\omega)}\right]e^{j\omega t} - \frac{A}{2j}\left[|G(j\omega)|e^{-j\arg G(j\omega)}\right]e^{-j\omega t}$$

$$\phantom{x_{ss}(t)} = \frac{A}{2j}|G(j\omega)|\left[e^{j[\omega t + \arg G(j\omega)]} - e^{-j[\omega t + \arg G(j\omega)]}\right]$$

That is,

$$x_{ss}(t) = A|G(j\omega)|\sin[\omega t + \arg G(j\omega)] \tag{5.103}$$


This indicates that if a stable linear system with transfer function G(s) is subjected to a
sinusoidal input then


(a) the steady-state system response is also a sinusoid having the same frequency ω
as the input;

(b) the amplitude of this response is |G(jω)| times the amplitude A of the input
sinusoid; the input is said to be amplified if |G(jω)| > 1 and attenuated if
|G(jω)| < 1;

(c) the phase shift between input and output is arg G(jω). The system is said to lead
if arg G(jω) > 0 and lag if arg G(jω) < 0.


The variations in both the magnitude |G(jω)| and argument arg G(jω) as the
frequency ω of the input sinusoid is varied constitute the frequency response of the
system, the magnitude |G(jω)| representing the amplitude gain or amplitude ratio of
the system for sinusoidal input with frequency ω, and the argument arg G(jω)
representing the phase shift.
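As a rough illustration of how the amplitude gain and phase shift follow from G(jω), the sketch below evaluates a transfer function given in the gain, zeros and poles form of (5.101) at s = jω. The function names and the sample system G(s) = 1/(s + 1) are assumptions made for this example only. Note that `cmath.phase` returns the principal value in (−π, π], so for higher-order systems the phase may need unwrapping.

```python
import cmath
import math


def frequency_response(K, zeros, poles, omega):
    """Return (|G(j*omega)|, arg G(j*omega)) for G(s) = K*prod(s - z)/prod(s - p)."""
    s = 1j * omega
    G = complex(K)
    for z in zeros:
        G *= (s - z)
    for p in poles:
        G /= (s - p)
    return abs(G), cmath.phase(G)


def steady_state(A, K, zeros, poles, omega, t):
    """Steady-state output A|G(jw)| sin(wt + arg G(jw)), as in (5.103)."""
    gain, phase = frequency_response(K, zeros, poles, omega)
    return A * gain * math.sin(omega * t + phase)


# Hypothetical stable system G(s) = 1/(s + 1), probed at omega = 1 rad/s.
gain, phase = frequency_response(1.0, [], [-1.0], 1.0)
print(f"gain = {gain:.4f}, phase = {math.degrees(phase):.1f} degrees")
```

For this sample system the gain is less than one and the phase is negative, so the input is attenuated and the output lags.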

The result (5.103) implies that the function G(jω) may be found experimentally by
subjecting a system to sinusoidal excitations and measuring the amplitude gain and
phase shift between output and input as the input frequency is varied over the range
0 < ω < ∞. In principle, therefore, frequency-response measurements may be used to
determine the system transfer function G(s).

In Chapters 7 and 8, dealing with Fourier series and Fourier transforms, we shall see 
that most functions can be written as sums of sinusoids, and consequently the response 
of a linear system to almost any input can be deduced in the form of the corresponding 
sinusoidal responses. It is important, however, to appreciate that the term ‘response’ in 
the expression ‘frequency response’ only relates to the steady-state response behaviour 
of the system. 

The information contained in the system frequency response may be conveniently
displayed in graphical form. In practice it is usual to represent it by two graphs: one
showing how the amplitude |G(jω)| varies with frequency and one showing how the
phase shift arg G(jω) varies with frequency.


Example 5.64

Determine the frequency response of the RC filter shown in Figure 5.54. Sketch the
amplitude and phase-shift plots.

Figure 5.54 RC filter.

Solution

The input-output relationship is given by

$$E_o(s) = \frac{1}{RCs + 1}E_i(s)$$

so that the filter is characterized by the transfer function

$$G(s) = \frac{1}{RCs + 1}$$

Therefore

$$G(j\omega) = \frac{1}{RCj\omega + 1} = \frac{1 - jRC\omega}{1 + R^2C^2\omega^2} = \frac{1}{1 + R^2C^2\omega^2} - j\frac{RC\omega}{1 + R^2C^2\omega^2}$$

giving the frequency-response characteristics

$$\text{amplitude ratio} = |G(j\omega)| = \sqrt{\left[\frac{1}{(1 + R^2C^2\omega^2)^2} + \frac{R^2C^2\omega^2}{(1 + R^2C^2\omega^2)^2}\right]} = \frac{1}{\sqrt{(1 + R^2C^2\omega^2)}}$$

$$\text{phase shift} = \arg G(j\omega) = -\tan^{-1}(RC\omega)$$

Note that for ω = 0

$$|G(j\omega)| = 1, \qquad \arg G(j\omega) = 0$$

and as ω → ∞

$$|G(j\omega)| \to 0, \qquad \arg G(j\omega) \to -\tfrac{1}{2}\pi$$

Plots of the amplitude and phase-shift curves are shown in Figures 5.55(a) and (b)
respectively.
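The closed-form amplitude ratio and phase shift of the RC filter can be compared with a direct evaluation of G(jω). This is a minimal sketch; the component values R = 1 kΩ and C = 1 μF are assumptions chosen only to give a corner at 1000 rad/s, and are not taken from the example.

```python
import cmath
import math


def rc_filter_response(R, C, omega):
    """Evaluate G(j*omega) = 1/(1 + j*R*C*omega); return (gain, phase in radians)."""
    G = 1 / (1 + 1j * R * C * omega)
    return abs(G), cmath.phase(G)


R, C = 1.0e3, 1.0e-6  # assumed values: 1 kOhm and 1 microfarad, so RC = 1 ms

for omega in (0.0, 100.0, 1000.0, 1.0e4, 1.0e6):
    gain, phase = rc_filter_response(R, C, omega)
    gain_formula = 1 / math.sqrt(1 + (R * C * omega) ** 2)   # closed-form amplitude ratio
    phase_formula = -math.atan(R * C * omega)                # closed-form phase shift
    print(f"omega={omega:9.1f}  gain={gain:.6f} ({gain_formula:.6f})  "
          f"phase={math.degrees(phase):8.3f} deg ({math.degrees(phase_formula):8.3f})")
```

The printed pairs agree, and the rows show the gain falling from 1 towards 0 while the phase moves from 0 towards −90° as ω increases.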


Figure 5.55 Frequency-response plots for Example 5.64: (a) amplitude plot; (b) phase-shift plot.


For the simple transfer function of Example 5.64, plotting the amplitude and phase-
shift characteristics was relatively easy. For higher-order transfer functions it can be
a rather tedious task, and it would be far more efficient to use a suitable computer
package. However, to facilitate the use of frequency-response techniques in system
design, engineers adopt a different approach, making use of Bode plots to display
the relevant information. This approach is named after H. W. Bode, who developed
the techniques at the Bell Laboratories in the late 1930s. Again it involves drawing
separate plots of amplitude and phase shift, but in this case on semi-logarithmic graph
paper, with frequency plotted on the horizontal logarithmic axis and amplitude, or phase,
on the vertical linear axis. It is also normal to express the amplitude gain in decibels
(dB); that is,

$$\text{amplitude gain in dB} = 20\log|G(j\omega)|$$

and the phase shift arg G(jω) in degrees. Thus the Bode plots consist of


(a) a plot of amplitude in decibels versus log ω, and

(b) a plot of phase shift in degrees versus log ω.

Note that with the amplitude gain measured in decibels, the input signal will be 
amplified if the gain is greater than zero and attenuated if it is less than zero. 
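The decibel convention can be illustrated with a few sample amplitude ratios. This is a small sketch; the helper name `gain_db` is introduced here for illustration only.

```python
import math


def gain_db(amplitude_ratio):
    """Amplitude gain in decibels: 20 log10 |G(j*omega)|."""
    return 20 * math.log10(amplitude_ratio)


for ratio in (10.0, 1.0, 1 / math.sqrt(2), 0.1):
    db = gain_db(ratio)
    label = "amplified" if db > 0 else "attenuated" if db < 0 else "unchanged"
    print(f"|G| = {ratio:.4f}  ->  {db:+.2f} dB  ({label})")
```

An amplitude ratio of 1 maps to 0 dB, which is why the sign of the decibel gain separates amplification from attenuation.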

The advantage of using Bode plots is that the amplitude and phase information can 
be obtained from the constituent parts of the transfer function by graphical addition. It 
is also possible to make simplifying approximations in which curves can be replaced by 
straight-line asymptotes. These can be drawn relatively quickly, and provide sufficient 
information to give an engineer a ‘feel’ for the system behaviour. Desirable system 
characteristics are frequently specified in terms of frequency-response behaviour, and 
since the approximate Bode plots permit quick determination of the effect of changes, 
they provide a good test for the system designer. 


Example 5.65

Draw the approximate Bode plots corresponding to the transfer function

$$G(s) = \frac{4\times 10^3(5 + s)}{s(100 + s)(20 + s)} \tag{5.104}$$

Solution

First we express the transfer function in what is known as the standard form, namely

$$G(s) = \frac{10(1 + 0.2s)}{s(1 + 0.01s)(1 + 0.05s)}$$

giving

$$G(j\omega) = \frac{10(1 + j0.2\omega)}{j\omega(1 + j0.01\omega)(1 + j0.05\omega)}$$

Taking logarithms to base 10,

$$20\log|G(j\omega)| = 20\log 10 + 20\log|1 + j0.2\omega| - 20\log|j\omega| - 20\log|1 + j0.01\omega| - 20\log|1 + j0.05\omega|$$

$$\arg G(j\omega) = \arg 10 + \arg(1 + j0.2\omega) - \arg j\omega - \arg(1 + j0.01\omega) - \arg(1 + j0.05\omega) \tag{5.105}$$

The transfer function involves constituents that are again a simple zero and simple
poles (including one at the origin). We shall now illustrate how the Bode plots can be
built up from those of the constituent parts.

Consider first the amplitude gain plot, which is a plot of 20 log |G(jω)| versus log ω:


(a) for a simple gain k a plot of 20 log k is a horizontal straight line, being above the
0 dB axis if k > 1 and below it if k < 1;


(b) for a simple pole at the origin a plot of −20 log ω is a straight line with slope
−20 dB/decade and intersecting the 0 dB axis at ω = 1;


(c) for a simple zero or pole not at the origin we see that

$$20\log|1 + j\tau\omega| \to \begin{cases} 0 & \text{as } \omega \to 0 \\ 20\log\tau\omega = 20\log\omega - 20\log(1/\tau) & \text{as } \omega \to \infty \end{cases}$$

Note that the graph of 20 log τω is a straight line with slope 20 dB/decade and
intersecting the 0 dB axis at ω = 1/τ. Thus the plot of 20 log|1 + jτω| may be approximated
by two straight lines: one for ω < 1/τ and one for ω > 1/τ. The frequency at intersection
ω = 1/τ is called the breakpoint or corner frequency; here |1 + jτω| = √2, enabling the
true curve to be indicated at this frequency. Using this approach, straight-line
approximations to the amplitude plots of a simple zero and a simple pole, neither at zero, are
shown in Figures 5.56(a) and (b) respectively (actual plots are also shown).
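The quality of the two-line approximation for a simple zero can be gauged numerically. The sketch below compares 20 log|1 + jτω| with its asymptotes for an assumed time constant τ = 0.2 (the zero appearing in Example 5.65); the helper names are illustrative.

```python
import math


def exact_db(omega, tau):
    """True amplitude curve 20 log10 |1 + j*tau*omega| for a simple zero."""
    return 20 * math.log10(abs(1 + 1j * tau * omega))


def asymptote_db(omega, tau):
    """Straight-line approximation: 0 dB below the corner, 20 dB/decade above it."""
    return 0.0 if omega <= 1 / tau else 20 * math.log10(tau * omega)


tau = 0.2  # assumed time constant, giving a corner frequency of 5 rad/s
for omega in (0.5, 2.5, 5.0, 10.0, 50.0):
    e, a = exact_db(omega, tau), asymptote_db(omega, tau)
    print(f"omega={omega:5.1f}  exact={e:7.3f} dB  asymptote={a:7.3f} dB  error={e - a:.3f} dB")
```

The error is largest at the corner frequency, where it equals 20 log √2 (about 3 dB), and becomes small a decade either side. For a simple pole the same figures apply with the signs reversed.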




Figure 5.56 Straight-line approximations to Bode amplitude plots: (a) simple zero; (b) simple pole. 


Using the approximation plots for the constituent parts as indicated in (a)—(c) ear- 
lier, we can build up the approximate amplitude gain plot corresponding to (5.104) by 
graphical addition as indicated in Figure 5.57. The actual amplitude gain plot, produced 
using a software package, is also shown. 
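The graphical addition described above can be mimicked in a few lines. This sketch, under the assumption that the constituent terms are those of the standard form of (5.104), sums the straight-line contributions and compares the result with the exact gain; the function names are illustrative.

```python
import math


def asymptote_db(omega, tau):
    """Straight-line amplitude approximation for the factor (1 + j*tau*omega)."""
    return 0.0 if omega <= 1 / tau else 20 * math.log10(tau * omega)


def approx_gain_db(omega):
    """Sum of the constituent straight-line plots for the G(s) of (5.104)."""
    return (20 * math.log10(10)          # constant gain 10
            + asymptote_db(omega, 0.2)   # simple zero, corner at 5 rad/s
            - 20 * math.log10(omega)     # pole at the origin
            - asymptote_db(omega, 0.01)  # simple pole, corner at 100 rad/s
            - asymptote_db(omega, 0.05)) # simple pole, corner at 20 rad/s


def exact_gain_db(omega):
    """Exact 20 log10 |G(j*omega)| evaluated directly from (5.104)."""
    s = 1j * omega
    G = 4e3 * (5 + s) / (s * (100 + s) * (20 + s))
    return 20 * math.log10(abs(G))


for omega in (0.1, 1.0, 5.0, 20.0, 100.0, 1000.0):
    print(f"omega={omega:7.1f}  approx={approx_gain_db(omega):8.3f} dB  "
          f"exact={exact_gain_db(omega):8.3f} dB")
```

The two columns agree closely away from the corner frequencies at 5, 20 and 100 rad/s, and differ by roughly 3 dB near each corner.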

The idea of using asymptotes can also be used to draw the phase-shift Bode plots,
again taking account of the accumulated effects of the individual components making
up the transfer function, namely that


(i) the phase shift associated with a constant gain k is zero;


(ii) the phase shift associated with a simple zero or pole at the origin is +90° or −90°
respectively;


(iii) for a simple zero or pole not at the origin

$$\tan^{-1}(\omega\tau) \to \begin{cases} 0 & \text{as } \omega \to 0 \\ 90^\circ & \text{as } \omega \to \infty \end{cases}$$

$$\tan^{-1}(\omega\tau) = 45^\circ \quad \text{when } \omega\tau = 1$$

Figure 5.57 Amplitude Bode plots for the G(s) of Example 5.65 (approximate and actual plots).


With these observations in mind, the following approximations are made. For
frequencies ω less than one-tenth of the corner frequency ω = 1/τ (that is, for ω < 1/(10τ))
the phase shift is assumed to be 0°, and for frequencies greater than ten times the
corner frequency (that is, for ω > 10/τ) the phase shift is assumed to be ±90°. For
frequencies between these limits (that is, 1/(10τ) < ω < 10/τ) the phase-shift plot is
taken to be a straight line that passes through 0° at ω = 1/(10τ), ±45° at ω = 1/τ, and ±90°
at ω = 10/τ. In each case the plus sign is associated with a zero and the minus sign with
a pole. With these assumptions, straight-line approximations to the phase-shift plots for
a simple zero and pole, neither located at the origin, are shown in Figures 5.58(a) and
(b) respectively (the actual plots are represented by the broken curves).

Using these approximations, a straight-line approximate phase-shift plot corresponding
to (5.105) is shown in Figure 5.59. Again, the actual phase-shift plot, produced using
a software package, is shown.
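The straight-line phase rule can be written as a short function and summed over the constituents of (5.105). This is a sketch under the assumption that the line between 1/(10τ) and 10/τ is straight on a logarithmic frequency axis, which gives a slope of 45° per decade; the function names are illustrative.

```python
import math


def phase_asymptote_deg(omega, tau):
    """Straight-line phase (degrees) of (1 + j*tau*omega): 0, a 45 deg/decade ramp, then 90."""
    x = omega * tau
    if x <= 0.1:
        return 0.0
    if x >= 10.0:
        return 90.0
    return 45.0 * (math.log10(x) + 1.0)


def approx_phase_deg(omega):
    """Approximate arg G(j*omega) for (5.105), adding zeros and subtracting poles."""
    return (0.0                                # constant gain 10
            + phase_asymptote_deg(omega, 0.2)  # simple zero
            - 90.0                             # pole at the origin
            - phase_asymptote_deg(omega, 0.01) # simple pole
            - phase_asymptote_deg(omega, 0.05))# simple pole


def exact_phase_deg(omega):
    """Exact arg G(j*omega) from (5.105), built term by term to avoid phase wrapping."""
    return math.degrees(math.atan(0.2 * omega) - math.pi / 2
                        - math.atan(0.01 * omega) - math.atan(0.05 * omega))


for omega in (0.01, 0.5, 5.0, 50.0, 1000.0):
    print(f"omega={omega:8.2f}  approx={approx_phase_deg(omega):9.3f} deg  "
          f"exact={exact_phase_deg(omega):9.3f} deg")
```

The approximation starts at −90° (the pole at the origin) and ends at −180°, with errors of a few degrees near the ends of each ramp.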


Figure 5.58 Approximate Bode phase-shift plots: (a) simple zero; (b) simple pole.

Figure 5.59 Phase-shift Bode plot for the G(s) of Example 5.65 (approximate and actual plots).


In MATLAB the amplitude and phase-shift plots are generated using the commands

    s = tf('s');
    G = 4*10^3*(s + 5)/(s*(100 + s)*(20 + s));
    bode(G)


In the graphical approach adopted in this section, separate plots of amplitude gain 
and phase shift versus frequency have been drawn. It is also possible to represent the 
frequency response graphically using only one plot. When this is done using the pair of
polar coordinates (|G(jω)|, arg G(jω)) and allowing the frequency ω to vary, the resulting
Argand diagram is referred to as the polar plot or frequency-response plot. Such a
graphical representation of the transfer function forms the basis of the Nyquist approach 
to the analysis of feedback systems. In fact, the main use of frequency-response methods 
in practice is in the analysis and design of closed-loop control systems. For the unity 
feedback system of Figure 5.45 the frequency-response plot of the forward-path 
transfer function G(s) is used to infer overall closed-loop system behaviour. The Bode 
plots are perhaps the quickest plots to construct, especially when straight-line approx- 
imations are made, and are useful when attempting to estimate a transfer function 
from a set of physical frequency-response measurements. Other plots used in practice 
are the Nichols diagram and the inverse Nyquist (or polar) plot, the first of these 
being useful for designing feedforward compensators and the second for designing 
feedback compensators. Although there is no simple mathematical relationship, it is 
also worth noting that transient behaviour may also be inferred from the various frequency- 
response plots. For example, the reciprocal of the inverse M circle centred on the —1 
point in the inverse Nyquist plot gives an indication of the peak over-shoot in the transient 
behaviour (see, for example, G. Franklin, D. Powell and A. Emami-Naeini, Feedback
Control of Dynamic Systems, Reading, MA, Addison-Wesley, 1986).


Investigation of such design tools may be carried out in MATLAB, incorporating the
Control Toolbox, using the command rltool(G).
5.9 Engineering application: pole placement


In Chapter 1 we examined the behaviour of linear continuous-time systems modelled in
the form of vector-matrix (or state-space) differential equations. In this chapter we have
extended this, concentrating on the transform domain representation using the Laplace
transform. In Chapter 6 we shall extend the approach to discrete-time systems using the
z-transform. So far we have concentrated on system analysis; that is, the question 'Given
the system, how does it behave?' In this section we turn our attention briefly to consider
the design or synthesis problem, and while it is not possible to produce an exhaustive
treatment, it is intended to give the reader an appreciation of the role of mathematics in
this task.


5.9.1 Poles and eigenvalues


By now the reader should be convinced that there is an association between system
poles as deduced from the system transfer function and the eigenvalues of the system
matrix in state-space form. Thus, for example, the system modelled by the second-order
differential equation

$$\frac{d^2y}{dt^2} - \frac{dy}{dt} - 2y = u$$

has transfer function

$$G(s) = \frac{1}{s^2 - s - 2}$$

The system can also be represented in the state-space form

$$\dot{x} = Ax + bu, \qquad y = c^Tx \tag{5.106}$$

where

$$x = [x_1 \;\; x_2]^T, \qquad A = \begin{bmatrix} 0 & 1 \\ 2 & 1 \end{bmatrix}, \qquad b = [0 \;\; 1]^T, \qquad c = [1 \;\; 0]^T$$

It is easy to check that the poles of the transfer function G(s) are at s = −1 and s = 2,
and that these values are also the eigenvalues of the matrix A. Clearly this is an
unstable system, with the pole or eigenvalue corresponding to s = 2 located in the
right half of the complex plane. In Section 5.9.2 we examine a method of moving this
unstable pole to a new location, thus providing a method of overcoming the stability
problem.
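The agreement between poles and eigenvalues for this system can be confirmed with a short calculation. The sketch below uses the quadratic formula for both the denominator of G(s) and the characteristic polynomial of the 2 × 2 state matrix; the helper names are introduced for illustration.

```python
import cmath


def quadratic_roots(b, c):
    """Roots of s**2 + b*s + c = 0."""
    d = cmath.sqrt(b * b - 4 * c)
    return ((-b + d) / 2, (-b - d) / 2)


def eigenvalues_2x2(A):
    """Eigenvalues of a 2x2 matrix from lambda**2 - trace*lambda + det = 0."""
    (a11, a12), (a21, a22) = A
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    return quadratic_roots(-trace, det)


A = [[0.0, 1.0], [2.0, 1.0]]          # state matrix of (5.106)
poles = quadratic_roots(-1.0, -2.0)   # denominator s**2 - s - 2 of G(s)
eigs = eigenvalues_2x2(A)

print("poles of G(s):    ", sorted(p.real for p in poles))
print("eigenvalues of A: ", sorted(e.real for e in eigs))
```

Both lists contain −1 and 2, and the positive value identifies the unstable mode.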


5.9.2 The pole placement or eigenvalue location technique


We now examine the possibility of introducing state feedback into the system. To do
this, we use as system input

$$u = k^Tx + u_{\text{ext}}$$

where $k = [k_1 \;\; k_2]^T$ and $u_{\text{ext}}$ is the external input. The state equation in (5.106) then
becomes

$$\dot{x} = \begin{bmatrix} 0 & 1 \\ 2 & 1 \end{bmatrix}x + \begin{bmatrix} 0 \\ 1 \end{bmatrix}[(k_1x_1 + k_2x_2) + u_{\text{ext}}]$$

That is,

$$\dot{x} = \begin{bmatrix} 0 & 1 \\ k_1 + 2 & k_2 + 1 \end{bmatrix}x + \begin{bmatrix} 0 \\ 1 \end{bmatrix}u_{\text{ext}}$$

Calculating the characteristic equation of the new system matrix, we find that the
eigenvalues are given by the roots of

$$\lambda^2 - (k_2 + 1)\lambda - (k_1 + 2) = 0$$

Suppose that we not only wish to stabilize the system, but also wish to improve the
response time. This could be achieved if both eigenvalues were located at (say) λ = −5,
which would require the characteristic equation to be

$$\lambda^2 + 10\lambda + 25 = 0$$

In order to make this pole relocation, we should choose

$$-(k_2 + 1) = 10, \qquad -(k_1 + 2) = 25$$

indicating that we take $k_1 = -27$ and $k_2 = -11$. Figure 5.60 shows the original system and
the additional state-feedback connections as dotted lines. We see that for this example
at least, it is possible to locate the system poles or eigenvalues wherever we please in
the complex plane, by a suitable choice of the vector k. This corresponds to the choice
of feedback gain, and in practical situations we are of course constrained by the need
to specify reasonable values for these. Nevertheless, this technique, referred to as pole
placement, is a powerful method for system control. There are some questions that
remain. For example, can we apply the technique to all systems? Also, can it be extended
to systems with more than one input? The following exercises will suggest answers to
these questions, and help to prepare the reader for the study of specialist texts.
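The gain calculation generalizes readily to any second-order system whose state matrix has rows [0 1] and [a21 a22] with b = [0 1]ᵀ. The following is a minimal sketch of that coefficient matching, assuming real desired poles; the function names are hypothetical and not part of any library.

```python
def place_poles_companion(a21, a22, p1, p2):
    """Gains (k1, k2) for u = k1*x1 + k2*x2 + u_ext placing the poles at p1 and p2.

    The closed-loop characteristic equation is
        lambda**2 - (a22 + k2)*lambda - (a21 + k1) = 0,
    which is matched against (lambda - p1)*(lambda - p2) = 0.
    """
    k1 = -p1 * p2 - a21
    k2 = (p1 + p2) - a22
    return k1, k2


def closed_loop_char_poly(a21, a22, k1, k2):
    """Coefficients (b, c) of lambda**2 + b*lambda + c for the closed-loop matrix."""
    return (-(a22 + k2), -(a21 + k1))


# System of (5.106): a21 = 2, a22 = 1; both poles moved to -5.
k1, k2 = place_poles_companion(2.0, 1.0, -5.0, -5.0)
print("k1 =", k1, " k2 =", k2)
print("closed-loop polynomial coefficients:", closed_loop_char_poly(2.0, 1.0, k1, k2))
```

The size of the gains grows with the distance the poles are moved, which is the practical constraint mentioned above.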


Figure 5.60 Feedback connections for eigenvalue location.
5.9.3 Exercises


65 An unstable system has Laplace transfer function

$$H(s) = \frac{1}{(s + \frac{1}{2})(s - 1)}$$

Make an appropriate choice of state variables to
represent this system in the form

$$\dot{x} = Ax + bu, \qquad y = c^Tx$$

where

$$x = [x_1 \;\; x_2]^T, \qquad A = \begin{bmatrix} 0 & 1 \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix}, \qquad b = [0 \;\; 1]^T, \qquad c = [1 \;\; 0]^T$$

This particular form of the state-space model in
which A takes the companion form and b has a
single 1 in the last row is called the control
canonical form of the system equations, and
pole placement is particularly straightforward
in this case.

Find a state-variable feedback control of the
form $u = k^Tx$ that will relocate both system poles
at s = −4, thus stabilizing the system.


66 Find the control canonical form of the state-space
equations for the system characterized by the
transfer function


-— 2 
(s+1)(s+4) 


G(s) 


Calculate or (better) simulate the step response 

of the system, and find a control law that relocates 
both poles at s = —5. Calculate or simulate the step 
response of the new system. How do the two 
responses differ? 


67 The technique for pole placement can be adapted
to multi-input systems in certain cases. Consider
the system

$$\dot{x} = Ax + Bu$$

where


xj. 


"be 


x= [x 


Writing $Bu = b_1u_1 + b_2u_2$, where $b_1 = [1 \;\; 1]^T$ and
$b_2 = [0 \;\; 1]^T$, enables us to work with each input
separately. As a first step, use only the input $u_1$
to relocate both the system poles at s = −5.
Secondly, use input $u_2$ only to achieve the same
result. Note that we can use either or both inputs
to obtain any pole locations we choose, subject of
course to physical constraints on the size of the
feedback gains.


68 The bad news is that it is not always possible to
use the procedure described in Exercise 67. In the
first place, it assumes that a full knowledge of the
state vector x(t) is available. This may not always
be the case; however, in many systems this problem
can be overcome by the use of an observer. For
details, a specialist text on control should be
consulted.

There are also circumstances in which the
system itself does not permit the use of the
technique. Such systems are said to be
uncontrollable, and the following example, which
is more fully discussed in J. G. Reid, Linear System
Fundamentals (McGraw-Hill, Tokyo, 1983),
demonstrates the problem. Consider the system


“ОЁ 


$y = [0 \;\; 1]\,x$


with 


Find the system poles and attempt to relocate both 
of them, at, say, s = —2. It will be seen that no gain 
vector k can be found to achieve this. Calculating 
the system transfer function gives a clue to the 
problem, but Exercise 69 shows how the problem 
could have been seen from the state-space form of 
the system. 


69 In Exercise 60 it was stated that the system

$$\dot{x} = Ax + bu, \qquad y = c^Tx$$

where A is an n × n matrix, is controllable provided
that the Kalman matrix

$$M = [\,b \quad Ab \quad A^2b \quad \cdots \quad A^{n-1}b\,]$$

is of rank n. This condition must be satisfied if
we are to be able to use the procedure for pole
placement. Calculate the Kalman controllability
matrix for the system in Exercise 68 and confirm
that it has rank less than n = 2. Verify that the
system of Exercise 65 satisfies the controllability
condition.


70 We have noted that when the system equations
are expressed in control canonical form, the
calculations for pole relocation are particularly
easy. The following technique shows how to
transform controllable systems into this form.
Given the system

$$\dot{x} = Ax + bu, \qquad y = c^Tx$$

calculate the Kalman controllability matrix M,
defined in Exercise 69, and its inverse $M^{-1}$.
Note that this will only exist for controllable
systems. Set $v^T$ as the last row of $M^{-1}$ and form
the transformation matrix

$$T = \begin{bmatrix} v^T \\ v^TA \\ \vdots \\ v^TA^{n-1} \end{bmatrix}$$

A transformation of state is now made by
introducing the new state vector z(t) = Tx(t), and the
resulting system will be in control canonical form.
To illustrate the technique, carry out the procedure
for the system defined by


el: e 


and show that this leads to the system 


ТЕ 


Finally, check that the two system matrices have 
the same eigenvalues, and show that this will 
always be the case. 


5.10 Review exercises (1-34) 


Check your answers using MATLAB or MAPLE whenever possible. 


1 Solve, using Laplace transforms, the following
differential equations:

(a) $$\frac{d^2x}{dt^2} + 4\frac{dx}{dt} + 5x = 8\cos t$$

subject to x = dx/dt = 0 at t = 0


2 
(b) or re = 
t t 


subject to x = 1 and dx = =й 
dt 

2 (a) Find the inverse Laplace transform of

$$\frac{1}{(s + 1)(s + 2)(s^2 + 2s + 2)}$$


(b) A voltage source Ve“sint is applied across a 
series LCR circuit with L = 1, R = 3 and C = s ; 


Show that the current i(f) in the circuit 
satisfies the differential equation 


2; Р 
di | Adi 


A | +2i = V e” sint 
dt t 


Find the current i(f) in the circuit at time 
t => 0 if i(f) satisfies the initial conditions 
i(0) = 1 and (di/dt)(0) = 2. 


Use Laplace transform methods to solve the 
simultaneous differential equations 


2 
dx 45M ay 
dt dt 
2 
Sa Ер 
dt dt 
: dx _ dy 
bject t == == = = tt=0. 
subject to x 2 y IT 0a 0 
4


Solve the differential equation 


2 
Cx 28x Lo. - cos ¢ 
dt dt 

subject to the initial conditions $x = x_0$ and
$dx/dt = x_1$ at t = 0. Identify the steady-state and
transient solutions. Find the amplitude and phase 
shift of the steady-state solution. 


5 Resistors of 5 Ω and 20 Ω are connected to the
primary and secondary coils of a transformer with
inductances as shown in Figure 5.61. At time t = 0,
with no currents flowing, a voltage E = 100 V
is applied to the primary circuit. Show that
subsequently the current in the secondary circuit is

$$\frac{20}{\sqrt{41}}\left(e^{-(11 + \sqrt{41})t/2} - e^{-(11 - \sqrt{41})t/2}\right)$$

Figure 5.61 Circuit of Review exercise 5 (labels recovered from the figure: resistors 5 Ω and 20 Ω, coil inductances 2 H and 3 H, mutual inductance M = 1 H, source E(t)).


(a) Find the Laplace transforms of 
(i) cos(@t+ ф) (ii) e "'sin (ot  $) 


(b) Using Laplace transform methods, solve the 
differential equation 
2 ^ 
л = cos 2t 
dt dt 


given that x = 2 and dx/dt= 1 when t=0. 

7 (a) Find the inverse Laplace transform of

$$\frac{s - 4}{s^2 + 4s + 13}$$

(b) Solve using Laplace transforms the differential
equation

$$\frac{dy}{dt} + 2y = 2(2 + \cos t + 2\sin t)$$

given that y = −3 when t = 0.

8 Using Laplace transforms, solve the simultaneous
differential equations

$$\frac{dx}{dt} + 5x + 3y = 5\sin t - 2\cos t$$

$$\frac{dy}{dt} + 3y + 5x = 6\sin t - 3\cos t$$

where x = 1 and y = 0 when t = 0.


9 The charge q on a capacitor in an inductive circuit
is given by the differential equation

$$\frac{d^2q}{dt^2} + 300\frac{dq}{dt} + 2\times 10^4 q = 200\sin 100t$$

and it is also known that both q and dq/dt are zero
when t = 0. Use the Laplace transform method to
find q. What is the phase difference between the
steady-state component of the current dq/dt and
the applied emf 200 sin 100t to the nearest
half-degree?


Use Laplace transforms to find the value of x 
given that 


49 + бх +у= 2 5іп 21 


2 
а = 
dt dt 


and that x = 2 and dx/dt = —2 when t= 0. 


11 (a) Use Laplace transforms to solve the
differential equation

$$\frac{d^2\theta}{dt^2} + 8\frac{d\theta}{dt} + 16\theta = \sin 2t$$

given that θ = 0 and dθ/dt = 0 when t = 0.


(b) Using Laplace transforms, solve the 
simultaneous differential equations 


di Я : 
Gt Zit 6 = 0 


udo 
H~ 
ni ores 
given that i; 2 1, = 0 when t 2 0. 


12 The terminals of a generator producing a voltage
V are connected through a wire of resistance
R and a coil of inductance L (and negligible
resistance). A capacitor of capacitance C
is connected in parallel with the resistance
R as shown in Figure 5.62. Show that the
current i flowing through the resistance R is
given by

$$LCR\frac{d^2i}{dt^2} + L\frac{di}{dt} + Ri = V$$

Figure 5.62 Circuit of Review exercise 12.


Suppose that

(i) V = 0 for t < 0 and V = E (constant) for t ≥ 0
(ii) L = 2R²C
(iii) CR = 1/2n

and show that the equation reduces to

$$\frac{d^2i}{dt^2} + 2n\frac{di}{dt} + 2n^2i = \frac{2n^2E}{R}$$

Hence, assuming that i = 0 and di/dt = 0 when
t = 0, use Laplace transforms to obtain an
expression for i in terms of t.

Show that the currents in the coupled circuits of 
Figure 5.63 are determined by the simultaneous 
differential equations 





Figure 5.63 Circuit of Review exercise 13. 


di ND Я 
Lat Ri —i)+ Ri =E 
di А Е: 
Lt Rb -R(i,-i)-0 
Find 7, in terms oft, L, E and R, given that i, 2 0 and 
di,/dt = E/L at t = 0, and show that i, ~ 2 E/R for 
large t. What does i, tend to for large t? 


14 A system consists of two unit masses lying in a
straight line on a smooth surface and connected
together to two fixed points by three springs. When
a sinusoidal force is applied to the system, the
displacements $x_1(t)$ and $x_2(t)$ of the respective
masses from their equilibrium positions satisfy
the equations

$$\frac{d^2x_1}{dt^2} = x_2 - 2x_1 + \sin 2t$$

$$\frac{d^2x_2}{dt^2} = -2x_2 + x_1$$

Given that the system is initially at rest in the
equilibrium position ($x_1 = x_2 = 0$), use the Laplace
transform method to solve the equations for $x_1(t)$
and $x_2(t)$.


(a) Obtain the inverse Laplace transforms of 


5 +4 (ii SER 
5 +25 +10 (s- 1 (s-2) 
(b) Use Laplace transforms to solve the 
differential equation 


2 
CY 420 «y Lie 
df 


given that y 2 4 and dy/dt = 2 when t = 0. 


(a) Determine the inverse Laplace transform of 


9 
s?- 14s 4-53 


(b) The equation of motion of the moving coil 
of a galvanometer when a current i is passed 
through it is of the form 

gO 4540 4,29 = т 

d? dt K 
where 0 is the angle of deflection from the 
*no-current? position and n and K are positive 
constants. Given that 7 is a constant and 
0 = 0 = dO/dt when t— 0, obtain an expression 
for the Laplace transform of 0. 

In constructing the galvanometer, it is desirable 
to have it critically damped (that is, n = K). 
Use the Laplace transform method to solve the 
differential equation in this case, and sketch the 
graph of 0 against f for positive values of t. 


17 (a) Given that α is a positive constant, use the
second shift theorem to

(i) show that the Laplace transform of
sin t H(t − α) is

$$e^{-\alpha s}\,\frac{\cos\alpha + s\sin\alpha}{s^2 + 1}$$

(ii) find the inverse transform of

$$\frac{s\,e^{-\alpha s}}{s^2 + 2s + 5}$$

(b) Solve the differential equation

$$\frac{d^2y}{dt^2} + 2\frac{dy}{dt} + 5y = \sin t - \sin t\,H(t - \pi)$$

given that y = dy/dt = 0 when t = 0.


Show that the Laplace transform of the 
voltage v(t), with period T, defined by 


(Оет 
= | | x v(t+T)=v(t) E 
= Лл =; 
18 
1 1 = CBee 
Vis) = -—— 
(s) 5 1 стт 


This voltage is applied to a capacitor of 100 UF and 
a resistor of 250 Q in series, with no charge initially 
on the capacitor. Show that the Laplace transform 
I(s) of the current i(t) flowing, for t = 0, is 


—sT/2. 
е 


1 ES 


I = = ——— 
(8) = 3505 40) 1 «e 77 


and give an expression, involving Heaviside step 
functions, for i(f) where 0 « t « 2T. For T - 10?s, 
is this a good representation of the steady-state 
response of the circuit? Briefly give a reason for 
your answer. 


19 The response x(t) of a control system to a forcing
term u(t) is given by the differential equation


diede 


ae +2х = u(t) (t=0) 
t t 


Determine the impulse response of the system, and 
hence, using the convolution integral, obtain the 
response of the system to a unit step u(t) = 1 H(t) 
applied at ¢ = 0, given that initially the system is in 
a quiescent state. Check your solution by directly 
solving the differential equation 


2 
л, 
df dt 


with x = dx/dt = 0 at t = 0.


20 A light horizontal beam, of length 5 m and constant
flexural rigidity EI, built in at the left-hand end
x = 0, is simply supported at the point x = 4 m and
carries a distributed load with density function

$$W(x) = \begin{cases} 12\ \text{kN m}^{-1} & (0 < x < 4) \\ 24\ \text{kN m}^{-1} & (4 < x < 5) \end{cases}$$

Write down the fourth-order boundary-value
problem satisfied by the deflection y(x). Solve this
problem to determine y(x), and write down the
resulting expressions for y(x) for the cases 0 ≤ x
≤ 4 and 4 ≤ x ≤ 5. Calculate the end reaction and
moment by evaluating appropriate derivatives of
y(x) at x = 0. Check that your results satisfy the
equation of equilibrium for the beam as a whole.


(a) Sketch the function defined by 
6 (=7= 1) 
ПОЕ АО 0) 
0 @>2) 


Express f(#) in terms of Heaviside step 
functions, and use the Laplace transform to 
solve the differential equation 


dx 

tx = f(t 

qu f(t) 
given that x 2 0 at t — 0. 

(b) The Laplace transform /(s) of the current i(7) 
in a certain circuit 1s given by 

ER 

s[Ls + R/(1 + Cs)] 


where Е, L, R and C are positive constants. 
Determine (i) lim i(¢) and (ii) lim 00). 
10 1° 


I(s) = 


22 Show that the Laplace transform of the half-
rectified sine-wave function

$$v(t) = \begin{cases} \sin t & (0 \le t \le \pi) \\ 0 & (\pi \le t \le 2\pi) \end{cases}$$

of period 2π, is

$$\frac{1}{(1 + s^2)(1 - e^{-\pi s})}$$

Such a voltage v(t) is applied to a 1 Ω resistor and
a 1 H inductor connected in series. Show that the
resulting current, initially zero, is $\tfrac{1}{2}\sum_{n=0}^{\infty} f(t - n\pi)$,
where $f(t) = (\sin t - \cos t + e^{-t})H(t)$. Sketch a
graph of the function f(t).


23 (a) Find the inverse Laplace transform of
$1/[s^2(s + 1)^2]$ by writing the expression in
the form $(1/s^2)[1/(s + 1)^2]$ and using the
convolution theorem.

(b) Use the convolution theorem to solve the 
integral equation 


у@=і+ | y (u) cos(t — u) du 


0 


24 


25 


26 


and the integro-differential equation 


| y" (u) y'(t — u) du — y()) 


0 


where y (0) = 0 and y'(0) 2 y,. Comment on the 
solution of the second equation. 


24 A beam of negligible weight and length 3l carries a
point load W at a distance l from the left-hand end.
Both ends are clamped horizontally at the same
level. Determine the equation governing the
deflection of the beam. If, in addition, the beam
is now subjected to a load per unit length, w,
over the shorter part of the beam, what will then
be the differential equation determining the
deflection?

(a) Using Laplace transforms, solve the
differential equation

\[ \frac{d^2x}{dt^2} + 3\frac{dx}{dt} + 3x = H(t - a) \quad (a > 0) \]

where H(t) is the Heaviside unit step function,
given that x = 0 and dx/dt = 0 at t = 0.

(b) The output x(t) from a stable linear control
system with input sin ωt and transfer function
G(s) is determined by the relationship

\[ X(s) = G(s)\,\mathcal{L}\{\sin \omega t\} \]

where $X(s) = \mathcal{L}\{x(t)\}$. Show that, after a long
time t, the output approaches $x_s(t)$, where

\[ x_s(t) = \mathrm{Re}\left( \frac{e^{j\omega t}\,G(j\omega)}{j} \right) \]


Consider the feedback system of Figure 5.64, where 
K is a constant feedback gain. 


Figure 5.64 Feedback system of Review
exercise 26.


(a) In the absence of feedback (that is, K = 0) is the
system stable?

(b) Write down the transfer function $G_1(s)$ for the
overall feedback system.

(c) Plot the locus of poles of $G_1(s)$ in the s plane
for both positive and negative values of K.

(d) From the plots in (c), specify for what 
range of values of K the feedback system is 
stable. 

(e) Confirm your answer to (d) using the 
Routh-Hurwitz criterion. 


(a) For the feedback control system of 
Figure 5.65(a) it is known that the impulse 
response is $h(t) = 2e^{-2t}\sin t$. Use this to
determine the value of the parameter α.

(b) Consider the control system of Figure 5.65(b), 
having both proportional and rate feedback. 
Determine the critical value of the gain K for 
stability of the closed-loop system. 


Figure 5.65 Feedback control systems of
Review exercise 27.


A continuous-time system is specified in 
state-space form as 


$\dot{x}(t) = Ax(t) + bu(t)$
$y(t) = c^{T}x(t)$


where 


«i 4p of 


(a) Draw a block diagram to represent the 
system. 

(b) Using Laplace transforms, show that 
the state transition matrix is given by 


NL ` с ос Ge 0 2 


-3{ -2t -3t -2t 
се Bic e 
(c) Calculate the impulse response of the system,
and determine the response y(t) of the system to
an input u(t) = 1 (t ≥ 0), subject to the initial
state $x(0) = [1\ \ 0]^{T}$.


A single-input-single-output system is represented 
in state-space form, using the usual notation, as 


$\dot{x}(t) = Ax(t) + bu(t)$
$y(t) = c^{T}x(t)$


For 


show that

\[ e^{At} = \begin{bmatrix} e^{-t}(\cos t - \sin t) & -e^{-t}\sin t \\ 2e^{-t}\sin t & e^{-t}(\cos t + \sin t) \end{bmatrix} \]

and find x(t) given that x(0) = 0 and u(t) = 1 (t ≥ 0).
Show that the Laplace transfer function of the
system is

\[ H(s) = \frac{Y(s)}{U(s)} = c^{T}(sI - A)^{-1}b \]

and find H(s) for this system. What is the system
impulse response?


A controllable linear plant that can be 
influenced by one input u(t) is modelled by 
the differential equation 


\[ \dot{x}(t) = Ax(t) + bu(t) \]

where $x(t) = [x_1(t)\ \ x_2(t)\ \ \ldots\ \ x_n(t)]^{T}$ is
the state vector, A is a constant matrix with
distinct real eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ and
$b = [b_1\ \ b_2\ \ \ldots\ \ b_n]^{T}$ is a constant vector.
By the application of the feedback control

\[ u(t) = K v_K^{T} x(t) \]

where $v_K$ is the eigenvector of $A^{T}$ corresponding
to the eigenvalue $\lambda_K$ of $A^{T}$ (and hence of A), the
eigenvalue $\lambda_K$ can be changed to a new real value $\rho_K$
without altering the other eigenvalues. To achieve
this, the feedback gain K is chosen as

\[ K = \frac{\rho_K - \lambda_K}{p_K} \]

where $p_K = v_K^{T} b$.


Show that the system represented by 


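\[ \dot{x}(t) = \begin{bmatrix} 1 & 2 & 0 \\ 0 & -1 & 0 \\ -3 & -3 & -2 \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} u(t) \]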


is controllable, and find the eigenvalues and 
corresponding eigenvectors of the system matrix. 
Deduce that the system is unstable in the absence 
of control, and determine a control law that will 
relocate the eigenvalue corresponding to the 
unstable mode at the new value —5. 


A second-order system is modelled by the 
differential equations 


\[ \dot{x}_1 + 2x_1 - 4x_2 = u \]
\[ \dot{x}_2 - x_2 = u \]

coupled with the output equation

\[ y = x_1 \]


(a) Express the model in state-space form. 


(b) Determine the transfer function of the system 
and show that the system is unstable. 


(c) Show that by using the feedback control law 
\[ u(t) = r(t) - ky(t) \]

where k is a scalar gain, the system will be
stabilized provided $k > \tfrac{2}{3}$.


(d) If r(t) = H(t), a unit step function, and $k > \tfrac{2}{3}$,
show that y(t) → 1 as t → ∞ if and only if $k = \tfrac{5}{3}$.


(An extended problem) The transient response 
of a practical control system to a unit step input 
often exhibits damped oscillations before reaching 
steady state. The following properties are some 
of those used to specify the transient response 
characteristics of an underdamped system: 


rise time, the time required for the response 
to rise from 0 to 100% of its final value; 

peak time, the time required for the response 
to reach the first peak of the overshoot; 

settling time, the time required for the response 
curve to reach and stay within a range about 
the final value of size specified by an absolute 
percentage of the final value (usually 2% or 
5%); 

maximum overshoot, the maximum peak 
value of the response measured from unity. 


Figure 5.66 Feedback control system of Review
exercise 32.


Consider the feedback control system of 
Figure 5.66 having both proportional and 
derivative feedback. It is desirable to choose the 
values of the gains K and $K_1$ so that the system
unit step response has a maximum overshoot of 
0.2 and a peak time of 1 s. 


(a) Obtain the overall transfer function of the 
closed-loop system. 

(b) Show that the unit step response of the system,
assuming zero initial conditions, may be
written in the form

\[ x(t) = 1 - e^{-\omega_n \xi t}\left[ \cos \omega_d t + \frac{\xi}{\sqrt{1 - \xi^2}} \sin \omega_d t \right] \]

where $\omega_d = \omega_n\sqrt{1 - \xi^2}$, $\omega_n^2 = K$ and
$2\omega_n\xi = 1 + KK_1$.


(c) Determine the values of the gains K and $K_1$ so
that the desired characteristics are achieved.

(d) With these values of K and $K_1$, determine the
rise time and settling time, comparing both the
2% and 5% criteria for the latter.


(An extended problem) The mass $M_1$ of the
mechanical system of Figure 5.67(a) is subjected to
a harmonic forcing term sin ωt. Determine the
steady-state response of the system.


Figure 5.67 Vibration absorber of
Review exercise 33.

It is desirable to design a vibration absorber to
absorb the steady-state oscillations so that in the
steady state x(t) = 0. To achieve this, a secondary
system is attached as illustrated in Figure 5.67(b).


(a) Show that, with an appropriate choice of $M_2$
and $K_2$, the desired objective may be achieved.

(b) What is the corresponding steady-state
motion of the mass $M_2$?

(c) Comment on the practicality of your design.


(An extended problem) The electronic amplifier
of Figure 5.68 has open-loop transfer function G(s)
with the following characteristics: a low-frequency
gain of 120 dB and simple poles at 1 MHz, 10 MHz
and 25 MHz. It may be assumed that the amplifier
is ideal, so that $K/(1 + K\beta) \approx 1/\beta$, where β is
the feedback gain and K the steady-state gain
associated with G(s).


Figure 5.68 Electronic amplifier of Review
exercise 34.


(a) Construct the magnitude versus log frequency 
and phase versus log frequency plots (Bode 
plots) for the open-loop system. 

(b) Determine from the Bode plots whether or
not the system is stable in the case of unity
feedback (that is, β = 1).

(c) Determine the value of β for marginal stability,
and hence the corresponding value of the
closed-loop low-frequency gain.

(d) Feedback is now applied to the amplifier to 
reduce the overall closed-loop gain at low 
frequencies to 100 dB. Determine the gain 
and phase margin corresponding to this 
closed-loop configuration. 

(e) Using the given characteristics, express G(s)
in the form

\[ G(s) = \frac{K}{(1 + s\tau_1)(1 + s\tau_2)(1 + s\tau_3)} \]

and hence obtain the input-output transfer
function for the amplifier.

(f) Write down the characteristic equation for the
closed-loop system and, using the Routh-Hurwitz
criterion, reconsider parts (b) and (c).


Chapter 6

The z Transform


Contents

6.1 Introduction
6.2 The z transform
6.3 Properties of the z transform
6.4 The inverse z transform
6.5 Discrete-time systems and difference equations
6.6 Discrete linear systems: characterization
6.7 The relationship between Laplace and z transforms
6.8 Solution of discrete-time state-space equations
6.9 Discretization of continuous-time state-space models
6.10 Engineering application: design of discrete-time systems
6.11 Engineering application: the delta operator and the $\mathcal{D}$ transform
6.12 Review exercises (1-18)


6.1 Introduction


In this chapter we focus attention on discrete-(time) processes. With the advent of fast 
and cheap digital computers, there has been renewed emphasis on the analysis and 
design of digital systems, which represent a major class of engineering systems. The 
main thrust of this chapter will be in this direction. However, it is a mistake to believe 
that the mathematical basis of this area of work is of such recent vintage. The first 
comprehensive text in English dealing with difference equations was The Treatise of 
the Calculus of Finite Differences by George Boole and published in 1860. Much of the 
early impetus for the finite calculus was due to the need to carry out interpolation and 
to approximate derivatives and integrals. Later, numerical methods for the solution of 
differential equations were devised, many of which were based on finite difference 
methods, involving the approximation of the derivative terms to produce a difference 
equation. The underlying idea in each case so far discussed is some form of approx- 
imation of an underlying continuous function or continuous-time process. There are 
situations, however, where it is more appropriate to propose a discrete-time model from 
the start. 

Digital systems operate on digital signals, which are usually generated by sampling 
a continuous-time signal, that is a signal defined for every instant of a possibly infinite 
time interval. The sampling process generates a discrete-time signal, defined only at 
the instants when sampling takes place so that a digital sequence is generated. After 
processing by a computer, the output digital signal may be used to construct a new 
continuous-time signal, perhaps by the use of a zero-order hold device, and this in 
turn might be used to control a plant or process. Digital signal processing devices 
have made a major impact in many areas of engineering, as well as in the home. For 
example, compact disc players, which operate using digital technology, offer such 
a significant improvement in reproduction quality that recent years have seen them 
rapidly take over from cassette tape players and vinyl record decks. DVD players 
are taking over from video players and digital radios are setting the standard for 
broadcasting. Both of these are based on digital technology. 

We have seen in Chapter 5 that the Laplace transform was a valuable aid in the 
analysis of continuous-time systems, and in this chapter we develop the z transform, 
which will perform the same task for discrete-time systems. We introduce the transform in 
connection with the solution of difference equations, and later we show how difference 
equations arise as discrete-time system models. 

The chapter includes two engineering applications. The first is on the design of 
digital filters, and highlights one of the major applications of transform methods as 
a design tool. It may be expected that whenever sampling is involved, performance will 
improve as sampling rate is increased. Engineers have found that this is not the full 
story, and the second application deals with some of the problems encountered. This 
leads on to an introduction to the unifying concept of the $\mathcal{D}$ transform, which brings
together the theories of the Laplace and z transforms.


6.2 The z transform

Since z transforms relate to sequences, we first review the notation associated with
sequences, which were considered in more detail in Chapter 7 of Modern Engineering
Mathematics. A finite sequence $\{x_k\}_0^n$ is an ordered set of n + 1 real or complex
numbers:

\[ \{x_k\}_0^n = \{x_0, x_1, x_2, \ldots, x_n\} \]

Note that the set of numbers is ordered so that position in the sequence is important.
The position is identified by the position index k, where k is an integer. If the number
of elements in the set is infinite then this leads to the infinite sequence

\[ \{x_k\}_0^\infty = \{x_0, x_1, x_2, \ldots\} \]

When dealing with sampled functions of time t, it is necessary to have a means of
allowing for t < 0. To do this, we allow the sequence of numbers to extend to infinity
on both sides of the initial position $x_0$, and write

\[ \{x_k\}_{-\infty}^{\infty} = \{\ldots, x_{-2}, x_{-1}, x_0, x_1, x_2, \ldots\} \]

Sequences $\{x_k\}_{-\infty}^{\infty}$ for which $x_k = 0$ (k < 0) are called causal sequences, by analogy
with continuous-time causal functions f(t)H(t) defined in Section 5.2.1 as

\[ f(t)H(t) = \begin{cases} 0 & (t < 0) \\ f(t) & (t \ge 0) \end{cases} \]

While for some finite sequences it is possible to specify the sequence by listing all the
elements of the set, it is normally the case that a sequence is specified by giving a
formula for its general element $x_k$.


6.2.1 Definition and notation

The z transform of a sequence $\{x_k\}_{-\infty}^{\infty}$ is defined in general as

\[ \mathcal{Z}\{x_k\}_{-\infty}^{\infty} = X(z) = \sum_{k=-\infty}^{\infty} \frac{x_k}{z^k} \qquad (6.1) \]

whenever the sum exists and where z is a complex variable, as yet undefined.


The process of taking the z transform of a sequence thus produces a function
of a complex variable z, whose form depends upon the sequence itself. The symbol
$\mathcal{Z}$ denotes the z-transform operator; when it operates on a sequence $\{x_k\}$ it
transforms the latter into the function X(z) of the complex variable z. It is usual to refer
to $\{x_k\}$, X(z) as a z-transform pair, which is sometimes written as $\{x_k\} \leftrightarrow X(z)$.
Note the similarity to obtaining the Laplace transform of a function in Section 5.2.1.
We shall return to consider the relationship between Laplace and z transforms in
Section 6.7.

For sequences $\{x_k\}_{-\infty}^{\infty}$ that are causal, that is

\[ x_k = 0 \quad (k < 0) \]

the z transform given in (6.1) reduces to

\[ \mathcal{Z}\{x_k\}_0^\infty = X(z) = \sum_{k=0}^{\infty} \frac{x_k}{z^k} \qquad (6.2) \]

In this chapter we shall be concerned with causal sequences, and so the definition
given in (6.2) will be the one that we shall use henceforth. We shall therefore from now
on take $\{x_k\}$ to denote $\{x_k\}_0^\infty$. Non-causal sequences, however, are of importance, and
arise particularly in the field of digital image processing, among others.


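Example 6.1 Determine the z transform of the sequence

\[ \{x_k\} = \{2^k\} \quad (k \ge 0) \]


Solution From the definition (6.2),

\[ \mathcal{Z}\{2^k\} = \sum_{k=0}^{\infty} \frac{2^k}{z^k} = \sum_{k=0}^{\infty} \left(\frac{2}{z}\right)^k \]

which we recognize as a geometric series, with common ratio r = 2/z between successive
terms. The series thus converges for |z| > 2, when

\[ \sum_{k=0}^{\infty} \left(\frac{2}{z}\right)^k = \lim_{k \to \infty} \frac{1 - (2/z)^k}{1 - 2/z} = \frac{1}{1 - 2/z} \]

leading to

\[ \mathcal{Z}\{2^k\} = \frac{z}{z - 2} \quad (|z| > 2) \qquad (6.3) \]

so that

\[ \{x_k\} = \{2^k\}, \qquad X(z) = \frac{z}{z - 2} \]

is an example of a z-transform pair.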


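From Example 6.1, we see that the z transform of the sequence $\{2^k\}$ exists provided
that we restrict the complex variable z so that it lies outside the circle |z| = 2 in the
z plane. From another point of view, the function

\[ X(z) = \frac{z}{z - 2} \quad (|z| > 2) \]

may be thought of as a generating function for the sequence $\{2^k\}$, in the sense that the
coefficient of $z^{-k}$ in the expansion of X(z) in powers of 1/z generates the kth term of
the sequence $\{2^k\}$. This can easily be verified, since

\[ \frac{z}{z - 2} = \frac{1}{1 - 2/z} = \left(1 - \frac{2}{z}\right)^{-1} \]

and, since |z| > 2, we can expand this as

\[ \left(1 - \frac{2}{z}\right)^{-1} = 1 + \frac{2}{z} + \left(\frac{2}{z}\right)^2 + \ldots + \left(\frac{2}{z}\right)^k + \ldots \]

and we see that the coefficient of $z^{-k}$ is indeed $2^k$, as expected.
We can generalize the result (6.3) in an obvious way to determine $\mathcal{Z}\{a^k\}$, the z transform
of the sequence $\{a^k\}$, where a is a real or complex constant. At once

\[ \mathcal{Z}\{a^k\} = \sum_{k=0}^{\infty} \left(\frac{a}{z}\right)^k = \frac{1}{1 - a/z} \quad (|z| > |a|) \]

so that

\[ \mathcal{Z}\{a^k\} = \frac{z}{z - a} \quad (|z| > |a|) \qquad (6.4) \]


Example 6.2 Show that

\[ \mathcal{Z}\{(-\tfrac{1}{2})^k\} = \frac{2z}{2z + 1} \quad (|z| > \tfrac{1}{2}) \]


Solution Taking $a = -\tfrac{1}{2}$ in (6.4), we have

\[ \mathcal{Z}\{(-\tfrac{1}{2})^k\} = \frac{z}{z - (-\tfrac{1}{2})} \quad (|z| > \tfrac{1}{2}) \]

so that

\[ \mathcal{Z}\{(-\tfrac{1}{2})^k\} = \frac{2z}{2z + 1} \quad (|z| > \tfrac{1}{2}) \]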

Further z-transform pairs can be obtained from (6.4) by formally differentiating
with respect to a, which for the moment we regard as a parameter. This gives

\[ \frac{d}{da}\mathcal{Z}\{a^k\} = \mathcal{Z}\left\{\frac{d a^k}{da}\right\} = \frac{d}{da}\left(\frac{z}{z - a}\right) \]

leading to

\[ \mathcal{Z}\{k a^{k-1}\} = \frac{z}{(z - a)^2} \quad (|z| > |a|) \qquad (6.5) \]

In the particular case a = 1 this gives

\[ \mathcal{Z}\{k\} = \frac{z}{(z - 1)^2} \quad (|z| > 1) \qquad (6.6) \]


Example 6.3 Find the z transform of the sequence

\[ \{2k\} = \{0, 2, 4, 6, 8, \ldots\} \]


Solution From (6.6),

\[ \mathcal{Z}\{k\} = \mathcal{Z}\{0, 1, 2, 3, \ldots\} = \sum_{k=0}^{\infty} \frac{k}{z^k} = \frac{z}{(z - 1)^2} \]

Using the definition (6.1),

\[ \mathcal{Z}\{0, 2, 4, 6, 8, \ldots\} = 0 + \frac{2}{z} + \frac{4}{z^2} + \frac{6}{z^3} + \frac{8}{z^4} + \ldots = 2\sum_{k=0}^{\infty} \frac{k}{z^k} \]

so that

\[ \mathcal{Z}\{2k\} = 2\mathcal{Z}\{k\} = \frac{2z}{(z - 1)^2} \qquad (6.7) \]

Example 6.3 demonstrates the ‘linearity’ property of the z transform, which we shall
consider further in Section 6.3.1.
A sequence of particular importance is the unit pulse or impulse sequence

\[ \{\delta_k\} = \{1, 0, 0, \ldots\} \]

It follows directly from the definition (6.2) that

\[ \mathcal{Z}\{\delta_k\} = 1 \qquad (6.8) \]

In MATLAB, using the Symbolic Math Toolbox, the z-transform of the sequence
$\{x_k\}$ is obtained by entering the commands

syms k z
ztrans(x_k)

As for Laplace transforms (see Section 5.2.2), the answer may be simplified using
the command simple(ans) and reformatted using the pretty command.
Considering the sequence $\{x_k\} = \{2^k\}$ of Example 6.1, the commands

syms k z
ztrans(2^k)

return

ans=1/2*z/(1/2*z-1)

Entering the command

simple(ans)

returns

ans=z/(z-2)

z transforms can be performed in MAPLE using the ztrans function; so the
commands:

ztrans(2^k,k,z);
simplify(%);

return

z/(z - 2)


6.2.2 Sampling: a first introduction


Sequences are often generated in engineering applications through the sampling of
continuous-time signals, described by functions f(t) of a continuous-time variable t.
Here we shall not discuss the means by which a signal is sampled, but merely suppose
this to be possible in idealized form.

Figure 6.1 Sampling of a continuous-time signal.

Figure 6.1 illustrates the idealized sampling process in which a continuous-time
signal f(t) is sampled instantaneously and perfectly at uniform intervals T, the sampling
interval. The idealized sampling process generates the sequence

\[ \{f(kT)\} = \{f(0), f(T), f(2T), \ldots, f(nT), \ldots\} \qquad (6.9) \]

Using the definition (6.1), we can take the z transform of the sequence (6.9) to give

\[ \mathcal{Z}\{f(kT)\} = \sum_{k=0}^{\infty} \frac{f(kT)}{z^k} \qquad (6.10) \]

whenever the series converges. This idea is simply demonstrated by an example.


Example 6.4 The signal $f(t) = e^{-t}H(t)$ is sampled at intervals T. What is the z transform of the resulting
sequence of samples?


Solution Sampling the causal function f(t) generates the sequence

\[ \{f(kT)\} = \{f(0), f(T), f(2T), \ldots, f(nT), \ldots\} = \{1, e^{-T}, e^{-2T}, e^{-3T}, \ldots, e^{-nT}, \ldots\} \]

Then, using (6.1),

\[ \mathcal{Z}\{f(kT)\} = \sum_{k=0}^{\infty} \frac{e^{-kT}}{z^k} = \sum_{k=0}^{\infty} \left(\frac{e^{-T}}{z}\right)^k \]

so that

\[ \mathcal{Z}\{e^{-kT}\} = \frac{z}{z - e^{-T}} \quad (|z| > e^{-T}) \qquad (6.11) \]

It is important to note in Example 6.4 that the region of convergence depends on the
sampling interval T.
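The dependence of the region of convergence on T can be illustrated numerically. In the sketch below, the function names, the test point z = 0.5 and the two sampling intervals are assumptions made for illustration: the same z lies inside the circle |z| = e^{-T} for T = 0.1 but outside it for T = 1.

```python
import math


def sample(f, T, n):
    """Idealized sampling as in (6.9): the first n samples f(0), f(T), f(2T), ..."""
    return [f(k * T) for k in range(n)]


def z_transform_of_samples(samples, z):
    """Truncated form of (6.10) applied to a finite list of samples."""
    return sum(x / z**k for k, x in enumerate(samples))


def closed_form(z, T):
    """Right-hand side of (6.11)."""
    return z / (z - math.exp(-T))


z = 0.5
results = {}
for T in (0.1, 1.0):
    radius = math.exp(-T)          # boundary of the region of convergence
    if abs(z) > radius:
        samples = sample(lambda t: math.exp(-t), T, 400)
        results[T] = (z_transform_of_samples(samples, z), closed_form(z, T))
    else:
        results[T] = None          # series (6.10) diverges at this z

print(results)
```

For T = 0.1 the circle has radius about 0.905, so the series diverges at z = 0.5; for T = 1 the radius is about 0.368 and the truncated sum agrees with (6.11).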


In MATLAB the commands

syms k T z
ztrans(exp(-k*T));
pretty(simple(ans))

return

ans=z/(z-exp(-T))

which confirms (6.11).
In MAPLE the commands:

ztrans(exp(-k*T),k,z);
simplify(%);

return

\[ \frac{z e^{T}}{z e^{T} - 1} \]


6.2.3 Exercises


1 Calculate the z transform of the following sequences,
stating the region of convergence in each case:

(a) $\{(\tfrac{1}{4})^k\}$   (b) $\{3^k\}$   (c) $\{(-2)^k\}$
(d) $\{-(2^k)\}$   (e) $\{3k\}$


2 The continuous-time signal $f(t) = e^{-2\omega t}$, where ω is
a real constant, is sampled when t ≥ 0 at intervals
T. Write down the general term of the sequence
of samples, and calculate the z transform of the
sequence.


6.3 Properties of the z transform


In this section we establish the basic properties of the z transform that will enable us to
develop further z-transform pairs, without having to compute them directly using the
definition.


6.3.1 The linearity property


As for Laplace transforms, a fundamental property of the z transform is its linearity,
which may be stated as follows.


If $\{x_k\}$ and $\{y_k\}$ are sequences having z transforms X(z) and Y(z) respectively and if
α and β are any constants, real or complex, then

\[ \mathcal{Z}\{\alpha x_k + \beta y_k\} = \alpha\mathcal{Z}\{x_k\} + \beta\mathcal{Z}\{y_k\} = \alpha X(z) + \beta Y(z) \qquad (6.12) \]

As a consequence of this property, we say that the z-transform operator $\mathcal{Z}$ is a linear
operator. A proof of the property follows readily from the definition (6.2), since

\[ \mathcal{Z}\{\alpha x_k + \beta y_k\} = \sum_{k=0}^{\infty} \frac{\alpha x_k + \beta y_k}{z^k} = \alpha\sum_{k=0}^{\infty} \frac{x_k}{z^k} + \beta\sum_{k=0}^{\infty} \frac{y_k}{z^k} = \alpha X(z) + \beta Y(z) \]

The region of existence of the z transform, in the z plane, of the linear sum will be the
intersection of the regions of existence (that is, the region common to both) of the
individual z transforms X(z) and Y(z).


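Example 6.5 The continuous-time function f(t) = cos ωt H(t), ω a constant, is sampled in the
idealized sense at intervals T to generate the sequence {cos kωT}. Determine the z transform
of the sequence.


Solution Using the result $\cos k\omega T = \tfrac{1}{2}(e^{jk\omega T} + e^{-jk\omega T})$ and the linearity property, we have

\[ \mathcal{Z}\{\cos k\omega T\} = \mathcal{Z}\{\tfrac{1}{2}e^{jk\omega T} + \tfrac{1}{2}e^{-jk\omega T}\} = \tfrac{1}{2}\mathcal{Z}\{e^{jk\omega T}\} + \tfrac{1}{2}\mathcal{Z}\{e^{-jk\omega T}\} \]

Using (6.4) with $a = e^{\pm j\omega T}$ and noting that $|e^{j\omega T}| = |e^{-j\omega T}| = 1$ gives

\[ \mathcal{Z}\{\cos k\omega T\} = \frac{1}{2}\frac{z}{z - e^{j\omega T}} + \frac{1}{2}\frac{z}{z - e^{-j\omega T}} \quad (|z| > 1) \]

\[ = \frac{1}{2}\,\frac{z(z - e^{-j\omega T}) + z(z - e^{j\omega T})}{z^2 - (e^{j\omega T} + e^{-j\omega T})z + 1} \]

leading to the z-transform pair

\[ \mathcal{Z}\{\cos k\omega T\} = \frac{z(z - \cos\omega T)}{z^2 - 2z\cos\omega T + 1} \quad (|z| > 1) \qquad (6.13) \]

In a similar manner to Example 6.5, we can verify the z-transform pair

\[ \mathcal{Z}\{\sin k\omega T\} = \frac{z\sin\omega T}{z^2 - 2z\cos\omega T + 1} \quad (|z| > 1) \qquad (6.14) \]

and this is left as an exercise for the reader (see Exercise 3).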
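The pairs (6.13) and (6.14) can be spot-checked numerically before attempting the algebra. The values ω = 2, T = 0.3 and the test point z = 1.2 + 0.5j (for which |z| = 1.3 > 1) are arbitrary assumptions for this sketch.

```python
import math


def z_transform_partial(x, z, terms=600):
    """Truncated form of (6.2): sum of x(k) / z**k for k = 0 .. terms-1."""
    return sum(x(k) / z**k for k in range(terms))


omega, T = 2.0, 0.3
z = 1.2 + 0.5j                      # |z| = 1.3, outside the unit circle

denominator = z**2 - 2 * z * math.cos(omega * T) + 1

# (6.13): sampled cosine
cos_sum = z_transform_partial(lambda k: math.cos(k * omega * T), z)
cos_closed = z * (z - math.cos(omega * T)) / denominator

# (6.14): sampled sine
sin_sum = z_transform_partial(lambda k: math.sin(k * omega * T), z)
sin_closed = z * math.sin(omega * T) / denominator

print(abs(cos_sum - cos_closed), abs(sin_sum - sin_closed))
```

Agreement at one point is not a proof, but it is a quick way to catch a sign or factor error in a hand derivation of (6.14).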
Check that in MATLAB the commands

syms k z ω T
ztrans(cos(k*ω*T));
pretty(simple(ans))

return the transform given in (6.13) and that the MAPLE commands:

ztrans(cos(k*ω*T),k,z);
simplify(%);

do likewise.


6.3.2 The first shift property (delaying)


In this and the next section we introduce two properties relating the z transform of a
sequence to the z transform of a shifted version of the same sequence. In this section
we consider a delayed version of the sequence $\{x_k\}$, denoted by $\{y_k\}$, with

\[ y_k = x_{k-k_0} \]

Here $k_0$ is the number of steps in the delay; for example, if $k_0 = 2$ then $y_k = x_{k-2}$,
so that

\[ y_0 = x_{-2}, \quad y_1 = x_{-1}, \quad y_2 = x_0, \quad y_3 = x_1 \]

and so on. Thus the sequence $\{y_k\}$ is simply the sequence $\{x_k\}$ moved backward, or
delayed, by two steps. From the definition (6.1),

\[ \mathcal{Z}\{y_k\} = \sum_{k=0}^{\infty} \frac{y_k}{z^k} = \sum_{k=0}^{\infty} \frac{x_{k-k_0}}{z^k} = \sum_{p=-k_0}^{\infty} \frac{x_p}{z^{p+k_0}} \]

where we have written $p = k - k_0$. If $\{x_k\}$ is a causal sequence, so that $x_p = 0$ (p < 0),
then

\[ \mathcal{Z}\{y_k\} = \sum_{p=0}^{\infty} \frac{x_p}{z^{p+k_0}} = \frac{1}{z^{k_0}}\sum_{p=0}^{\infty} \frac{x_p}{z^p} = \frac{1}{z^{k_0}}X(z) \]

where X(z) is the z transform of $\{x_k\}$.
We therefore have the result

\[ \mathcal{Z}\{x_{k-k_0}\} = \frac{1}{z^{k_0}}\mathcal{Z}\{x_k\} \qquad (6.15) \]

which is referred to as the first shift property of z transforms.

If $\{x_k\}$ represents the sampled form, with uniform sampling interval T, of the
continuous signal x(t) then $\{x_{k-k_0}\}$ represents the sampled form of the continuous signal
$x(t - k_0T)$ which, as illustrated in Figure 6.2, is the signal x(t) delayed by a multiple
$k_0$ of the sampling interval T. The reader will find it of interest to compare this result
with the results for the Laplace transforms of integrals (5.16).

Figure 6.2 Sequence and its shifted form.
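Example 6.6 The causal sequence $\{x_k\}$ is generated by

\[ x_k = (\tfrac{1}{2})^k \quad (k \ge 0) \]

Determine the z transform of the shifted sequence $\{x_{k-2}\}$.


Solution By the first shift property,

\[ \mathcal{Z}\{x_{k-2}\} = \frac{1}{z^2}\mathcal{Z}\{(\tfrac{1}{2})^k\} \]

which, on using (6.4), gives

\[ \mathcal{Z}\{x_{k-2}\} = \frac{1}{z^2}\,\frac{z}{z - \tfrac{1}{2}} = \frac{1}{z^2}\,\frac{2z}{2z - 1} = \frac{2}{z(2z - 1)} \quad (|z| > \tfrac{1}{2}) \]

We can confirm this result by direct use of the definition (6.1). From this, and the fact
that $\{x_k\}$ is a causal sequence,

\[ \{x_{k-2}\} = \{x_{-2}, x_{-1}, x_0, x_1, \ldots\} = \{0, 0, 1, \tfrac{1}{2}, \tfrac{1}{4}, \ldots\} \]

Thus,

\[ \mathcal{Z}\{x_{k-2}\} = 0 + 0 + \frac{1}{z^2} + \frac{1}{2z^3} + \frac{1}{4z^4} + \ldots = \frac{1}{z^2}\left(1 + \frac{1}{2z} + \frac{1}{4z^2} + \ldots\right) \]

\[ = \frac{1}{z^2}\,\frac{z}{z - \tfrac{1}{2}} \quad (|z| > \tfrac{1}{2}) = \frac{2}{z(2z - 1)} \quad (|z| > \tfrac{1}{2}) \]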

6.3.3 The second shift property (advancing)


In this section we seek a relationship between the z transform of an advanced version
of a sequence and that of the original sequence. First we consider a single-step
advance. If $\{y_k\}$ is the single-step advanced version of the sequence $\{x_k\}$ then $\{y_k\}$ is
generated by

\[ y_k = x_{k+1} \quad (k \ge 0) \]

Then

\[ \mathcal{Z}\{y_k\} = \sum_{k=0}^{\infty} \frac{y_k}{z^k} = \sum_{k=0}^{\infty} \frac{x_{k+1}}{z^k} = z\sum_{k=0}^{\infty} \frac{x_{k+1}}{z^{k+1}} \]

and putting p = k + 1 gives

\[ \mathcal{Z}\{y_k\} = z\sum_{p=1}^{\infty} \frac{x_p}{z^p} = z\left(\sum_{p=0}^{\infty} \frac{x_p}{z^p} - x_0\right) = zX(z) - zx_0 \]

where X(z) is the z transform of $\{x_k\}$.
We therefore have the result

\[ \mathcal{Z}\{x_{k+1}\} = zX(z) - zx_0 \qquad (6.16) \]

In a similar manner it is readily shown that for a two-step advanced sequence $\{x_{k+2}\}$

\[ \mathcal{Z}\{x_{k+2}\} = z^2X(z) - z^2x_0 - zx_1 \qquad (6.17) \]

Note the similarity in structure between (6.16) and (6.17) on the one hand and those for
the Laplace transforms of first and second derivatives (Section 5.3.1). In general, it is
readily proved by induction that for a $k_0$-step advanced sequence $\{x_{k+k_0}\}$

\[ \mathcal{Z}\{x_{k+k_0}\} = z^{k_0}X(z) - \sum_{n=0}^{k_0-1} x_n z^{k_0-n} \qquad (6.18) \]

In Section 6.5.2 we shall use these results to solve difference equations.
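A quick numerical check of (6.16) to (6.18) is sketched below for the sequence x_k = (1/2)^k, whose transform follows from (6.4). The function names, the test point z = 2 and the choice of sequence are illustrative assumptions.

```python
def z_transform_partial(x, z, terms=300):
    """Truncated form of (6.2): sum of x(k) / z**k for k = 0 .. terms-1."""
    return sum(x(k) / z**k for k in range(terms))


def advanced_transform(x, X_of_z, z, k0):
    """Right-hand side of (6.18): z^k0 X(z) minus the first k0 terms, suitably weighted."""
    return z**k0 * X_of_z - sum(x(n) * z**(k0 - n) for n in range(k0))


def x(k):
    """Illustrative causal sequence x_k = (1/2)^k."""
    return 0.5**k


z = 2.0
X = z / (z - 0.5)            # Z{(1/2)^k} from (6.4) with a = 1/2

errors = []
for k0 in (1, 2, 3):
    direct = z_transform_partial(lambda k: x(k + k0), z)   # transform of the advanced sequence
    formula = advanced_transform(x, X, z, k0)
    errors.append(abs(direct - formula))

print(errors)
```

The cases k0 = 1 and k0 = 2 correspond to (6.16) and (6.17); the subtracted terms play the same role as the initial conditions in the Laplace transforms of derivatives.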


6.3.4 Some further properties


In this section we shall state some further useful properties of the z transform, leaving
their verification to the reader as Exercises 9 and 10.


(i) Multiplication by $a^k$
If $\mathcal{Z}\{x_k\} = X(z)$ then for a constant a

\[ \mathcal{Z}\{a^k x_k\} = X(a^{-1}z) \qquad (6.19) \]


(ii) Multiplication by $k^n$
If $\mathcal{Z}\{x_k\} = X(z)$ then for a positive integer n

\[ \mathcal{Z}\{k^n x_k\} = \left(-z\frac{d}{dz}\right)^n X(z) \qquad (6.20) \]

Note that in (6.20) the operator −z d/dz means ‘first differentiate with respect
to z and then multiply by −z’. Raising to the power of n means ‘repeat the
operation n times’.


(iii) Initial-value theorem

If $\{x_k\}$ is a sequence with z transform X(z) then the initial-value theorem states
that

\[ \lim_{z \to \infty} X(z) = x_0 \qquad (6.21) \]

(iv) Final-value theorem

If $\{x_k\}$ is a sequence with z transform X(z) then the final-value theorem states
that

\[ \lim_{k \to \infty} x_k = \lim_{z \to 1} (1 - z^{-1})X(z) \qquad (6.22) \]

provided that the poles of $(1 - z^{-1})X(z)$ are inside the unit circle.
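Properties (6.19), (6.21) and (6.22) can be illustrated on the sequence x_k = 1 − (1/2)^k, for which x_0 = 0 and x_k → 1. This is a sketch only: the sequence, the constant a = 0.8 and the evaluation points are assumptions chosen so that the pole condition of the final-value theorem holds.

```python
def X(z):
    """Transform of x_k = 1 - (1/2)^k, built by linearity from z/(z-1) and (6.4)."""
    return z / (z - 1) - z / (z - 0.5)


def x(k):
    return 1 - 0.5**k


# Initial-value theorem (6.21): X(z) tends to x_0 = 0 as z grows.
initial_estimate = X(1e9)

# Final-value theorem (6.22): (1 - 1/z) X(z) tends to lim x_k = 1 as z -> 1.
# Here (1 - 1/z) X(z) = 1 - (z - 1)/(z - 1/2), whose only pole z = 1/2 is inside the unit circle.
z_near_one = 1 + 1e-7
final_estimate = (1 - 1 / z_near_one) * X(z_near_one)

# Multiplication by a^k, (6.19): Z{a^k x_k} = X(z / a).
a, z = 0.8, 2.0
scaled_sum = sum(a**k * x(k) / z**k for k in range(300))
scaled_closed = X(z / a)

print(initial_estimate, final_estimate, scaled_sum - scaled_closed)
```

If the pole condition fails, for example with x_k = 2^k, the limit on the right of (6.22) may still exist even though the sequence has no limit, so the condition should be checked before the theorem is applied.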


6.3.5 Table of z transforms


It is appropriate at this stage to draw together the results proved so far for easy access.
This is done in the form of a table in Figure 6.3.


{x_k} (k ≥ 0)                                Z{x_k}                                Region of existence

x_k = 1 (k = 0), 0 (k > 0)                   1                                     All z
(unit pulse sequence)
x_k = 1 (unit step sequence)                 z/(z − 1)                             |z| > 1
x_k = a^k (a constant)                       z/(z − a)                             |z| > |a|
x_k = k                                      z/(z − 1)^2                           |z| > 1
x_k = k a^(k−1) (a constant)                 z/(z − a)^2                           |z| > |a|
x_k = e^(−kT) (T constant)                   z/(z − e^(−T))                        |z| > e^(−T)
x_k = cos kωT (ω, T constants)               z(z − cos ωT)/(z^2 − 2z cos ωT + 1)   |z| > 1
x_k = sin kωT (ω, T constants)               z sin ωT/(z^2 − 2z cos ωT + 1)        |z| > 1

Figure 6.3 A short table of z transforms.
6.3.6 Exercises


Check your answers using MATLAB or MAPLE whenever possible.


3 Use the method of Example 6.5 to confirm (6.14),
namely

\[ \mathcal{Z}\{\sin k\omega T\} = \frac{z\sin\omega T}{z^2 - 2z\cos\omega T + 1} \]

where ω and T are constants.


4 Use the first shift property to calculate the z
transform of the sequence $\{y_k\}$, with

\[ y_k = \begin{cases} 0 & (k < 3) \\ x_{k-3} & (k \ge 3) \end{cases} \]

where $\{x_k\}$ is causal and $x_k = (\tfrac{1}{2})^k$. Confirm your
result by direct evaluation of $\mathcal{Z}\{y_k\}$ using the
definition of the z transform.


5 Determine the z transforms of the sequences

(a) $\{(-\tfrac{1}{5})^k\}$   (b) $\{\cos k\pi\}$


6 Determine $\mathcal{Z}\{(\tfrac{1}{2})^k\}$. Using (6.6), obtain the z
transform of the sequence $\{k(\tfrac{1}{2})^k\}$.


7 Show that for a constant α

(a) $\mathcal{Z}\{\sinh k\alpha\} = \dfrac{z\sinh\alpha}{z^2 - 2z\cosh\alpha + 1}$

(b) $\mathcal{Z}\{\cosh k\alpha\} = \dfrac{z^2 - z\cosh\alpha}{z^2 - 2z\cosh\alpha + 1}$


8 Sequences are generated by sampling a causal
continuous-time signal u(t) (t ≥ 0) at uniform
intervals T. Write down an expression for $u_k$, the
general term of the sequence, and calculate the
corresponding z transform when u(t) is

(a) $e^{-4t}$   (b) sin t   (c) cos 2t


9 Prove the initial- and final-value theorems given in
(6.21) and (6.22).


10 Prove the multiplication properties given in (6.19)
and (6.20).


6.4 The inverse z transform


In this section we consider the problem of recovering a causal sequence $\{x_k\}$ from
knowledge of its z transform X(z). As we shall see, the work on the inversion of Laplace
transforms in Section 5.2.7 will prove a valuable asset for this task.

Formally the symbol $\mathcal{Z}^{-1}[X(z)]$ denotes a causal sequence $\{x_k\}$ whose z transform is
X(z); that is,

if $\mathcal{Z}\{x_k\} = X(z)$ then $\{x_k\} = \mathcal{Z}^{-1}[X(z)]$

This correspondence between X(z) and $\{x_k\}$ is called the inverse z transformation,
$\{x_k\}$ being the inverse transform of X(z), and $\mathcal{Z}^{-1}$ being referred to as the inverse
z-transform operator.


As for the Laplace transforms in Section 5.2.8, the most obvious way of finding the 
inverse transform of X(z) is to make use of a table of transforms such as that given in 
Figure 6.3. Sometimes it is possible to write down the inverse transform directly from 
the table, but more often than not it is first necessary to carry out some algebraic manip- 
ulation on X(z). In particular, we frequently need to determine the inverse transform of 
a rational expression of the form P(z)/Q(z), where P(z) and Q(z) are polynomials in z. 
In such cases the procedure, as for Laplace transforms, is first to resolve the expression, 
or a revised form of the expression, into partial fractions and then to use the table of 
transforms. We shall now illustrate the approach through some examples. 


6.4.1 Inverse techniques


Example 6.7 Find

\[ \mathcal{Z}^{-1}\left[\frac{z}{z - 2}\right] \]


Solution From Figure 6.3, we see that z/(z − 2) is a special case of the transform z/(z − a), with
a = 2. Thus

\[ \mathcal{Z}^{-1}\left[\frac{z}{z - 2}\right] = \{2^k\} \]


Example 6.8 Find

\[ \mathcal{Z}^{-1}\left[\frac{z}{(z - 1)(z - 2)}\right] \]


Solution Guided by our work on Laplace transforms, we might attempt to resolve

\[ Y(z) = \frac{z}{(z - 1)(z - 2)} \]

into partial fractions. This approach does produce the correct result, as we shall show
later. However, we notice that most of the entries in Figure 6.3 contain a factor z in the
numerator of the transform. We therefore resolve

\[ \frac{Y(z)}{z} = \frac{1}{(z - 1)(z - 2)} \]

into partial fractions, as

\[ \frac{Y(z)}{z} = \frac{1}{z - 2} - \frac{1}{z - 1} \]

so that

\[ Y(z) = \frac{z}{z - 2} - \frac{z}{z - 1} \]

Then using the result $\mathcal{Z}^{-1}[z/(z - a)] = \{a^k\}$ together with the linearity property, we have

\[ \mathcal{Z}^{-1}[Y(z)] = \mathcal{Z}^{-1}\left[\frac{z}{z - 2}\right] - \mathcal{Z}^{-1}\left[\frac{z}{z - 1}\right] = \{2^k\} - \{1^k\} \quad (k \ge 0) \]

so that

\[ \mathcal{Z}^{-1}\left[\frac{z}{(z - 1)(z - 2)}\right] = \{2^k - 1\} \quad (k \ge 0) \qquad (6.23) \]

Suppose that in Example 6.8 we had not thought so far ahead and we had simply
resolved Y(z), rather than Y(z)/z, into partial fractions. Would the result be the same?
The answer of course is ‘yes’, as we shall now show. Resolving

\[ Y(z) = \frac{z}{(z - 1)(z - 2)} \]

into partial fractions gives

\[ Y(z) = \frac{2}{z - 2} - \frac{1}{z - 1} \]

which may be written as

\[ Y(z) = \frac{1}{z}\,\frac{2z}{z - 2} - \frac{1}{z}\,\frac{z}{z - 1} \]

Since $\mathcal{Z}^{-1}[2z/(z - 2)] = \{2 \cdot 2^k\}$, it follows from the first shift property (6.15) that

\[ \mathcal{Z}^{-1}\left[\frac{1}{z}\,\frac{2z}{z - 2}\right] = \begin{cases} \{2 \cdot 2^{k-1}\} & (k > 0) \\ 0 & (k = 0) \end{cases} \]

Similarly,

\[ \mathcal{Z}^{-1}\left[\frac{1}{z}\,\frac{z}{z - 1}\right] = \begin{cases} \{1^{k-1}\} = \{1\} & (k > 0) \\ 0 & (k = 0) \end{cases} \]

Combining these last two results, we have

\[ \mathcal{Z}^{-1}[Y(z)] = \mathcal{Z}^{-1}\left[\frac{1}{z}\,\frac{2z}{z - 2}\right] - \mathcal{Z}^{-1}\left[\frac{1}{z}\,\frac{z}{z - 1}\right] = \begin{cases} \{2^k - 1\} & (k > 0) \\ 0 & (k = 0) \end{cases} \]

which, as expected, is in agreement with the answer obtained in Example 6.8.

We can see that adopting this latter approach, while producing the correct result,
involved extra effort in the use of a shift theorem. When possible, we avoid this by
‘extracting’ the factor z as in Example 6.8, but of course this is not always possible,
and recourse may be made to the shift property, as Example 6.9 illustrates.
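An independent check on a partial-fraction inversion is to expand the rational function in powers of 1/z by long division and read off the coefficients, which by (6.2) are the terms of the causal sequence. The routine below is a sketch of that technique, not the method used in the text; the function name and the coefficient-list convention (descending powers of z) are assumptions.

```python
from fractions import Fraction


def inverse_by_long_division(num, den, n_terms):
    """Return x_0 .. x_(n_terms-1) such that sum of x_k / z**k equals num(z)/den(z).

    num and den list polynomial coefficients in descending powers of z.
    deg(num) <= deg(den) is assumed, so that the expansion has no positive powers of z.
    """
    if len(num) > len(den):
        raise ValueError("numerator degree must not exceed denominator degree")
    m = len(den) - 1
    # Pad the numerator to degree m; dividing both polynomials by z^m turns
    # them into polynomials in w = 1/z with the same coefficient lists.
    n = [Fraction(0)] * (len(den) - len(num)) + [Fraction(c) for c in num]
    d = [Fraction(c) for c in den]
    x = []
    for k in range(n_terms):
        acc = n[k] if k <= m else Fraction(0)
        for i in range(1, min(k, m) + 1):
            acc -= d[i] * x[k - i]      # match coefficients of w^k in X(w) D(w) = N(w)
        x.append(acc / d[0])
    return x


# Example 6.8: Y(z) = z / ((z - 1)(z - 2)) = z / (z^2 - 3z + 2)
coeffs = inverse_by_long_division([1, 0], [1, -3, 2], 10)
expected = [2**k - 1 for k in range(10)]
print(coeffs == expected)
```

Exact rational arithmetic is used so that the comparison with 2^k − 1 in (6.23) is not affected by rounding. The expansion yields only as many terms as are requested, whereas the partial-fraction route gives the general term.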


The inverse z-transform $\{x_k\}$ of X(z) is returned in MATLAB using the command

iztrans(X(z),k)

[Note: The command iztrans(X(z)) by itself returns the inverse transform
expressed in terms of n rather than k.]
For the z-transform in Example 6.8 the MATLAB command

iztrans(z/((z-1)*(z-2)),k)

returns

ans=-1+2^k

as required.
The inverse z-transform can be performed in MAPLE using the invztrans
function, so that the command

invztrans(z/((z-1)*(z-2)),z,k);

also returns the answer

2^k - 1

Find 
-1 2z+1 
2 E 


In this case there is no factor z available in the numerator, and so we must resolve 


$$Y(z) = \frac{2z+1}{(z+1)(z-3)}$$

into partial fractions, giving

$$Y(z) = \frac{\tfrac14}{z+1} + \frac{\tfrac74}{z-3} = \frac14\,\frac{1}{z}\,\frac{z}{z+1} + \frac74\,\frac{1}{z}\,\frac{z}{z-3}$$

Since

$$\mathcal{Z}^{-1}\left[\frac{z}{z+1}\right] = \{(-1)^k\} \quad (k \ge 0)$$

$$\mathcal{Z}^{-1}\left[\frac{z}{z-3}\right] = \{3^k\} \quad (k \ge 0)$$

it follows from the first shift property (6.15) that

$$\mathcal{Z}^{-1}\left[\frac{1}{z}\,\frac{z}{z+1}\right] = \begin{cases} (-1)^{k-1} & (k > 0) \\ 0 & (k = 0) \end{cases}$$

$$\mathcal{Z}^{-1}\left[\frac{1}{z}\,\frac{z}{z-3}\right] = \begin{cases} 3^{k-1} & (k > 0) \\ 0 & (k = 0) \end{cases}$$

Then, from the linearity property,

$$\mathcal{Z}^{-1}[Y(z)] = \frac14\,\mathcal{Z}^{-1}\left[\frac{1}{z}\,\frac{z}{z+1}\right] + \frac74\,\mathcal{Z}^{-1}\left[\frac{1}{z}\,\frac{z}{z-3}\right]$$

giving

$$\mathcal{Z}^{-1}\left[\frac{2z+1}{(z+1)(z-3)}\right] = \begin{cases} \tfrac14(-1)^{k-1} + \tfrac74\,3^{k-1} & (k > 0) \\ 0 & (k = 0) \end{cases}$$

In MATLAB the command

    iztrans((2*z+1)/((z+1)*(z-3)),k)

returns

    ans=-1/3*charfcn[0](k)-1/4*(-1)^k+7/12*3^k

[Note: The charfcn function is the characteristic function of the set A, and is defined to be

$$\mathrm{charfcn}[A](k) = \begin{cases} 1 & \text{if } k \text{ is in } A \\ 0 & \text{if } k \text{ is not in } A \end{cases}$$

Thus charfcn[0](k) = 1 if k = 0 and 0 otherwise.]
It is left as an exercise to confirm that the answer provided using MATLAB
concurs with the calculated answer.


It is often the case that the rational function P(z)/Q(z) to be inverted has a quadratic
term in the denominator. Unfortunately, in this case there is nothing resembling the
first shift theorem of the Laplace transform which, as we saw in Section 5.2.9, proved
so useful in similar circumstances. Looking at Figure 6.3, the only two transforms with
quadratic terms in the denominator are those associated with the sequences $\{\cos k\omega T\}$
and $\{\sin k\omega T\}$. In practice these prove difficult to apply in the inverse form, and a
‘first principles’ approach is more appropriate. We illustrate this with two examples,
demonstrating that all that is really required is the ability to handle complex numbers
at the stage of resolution into partial fractions.


Example 6.10 Invert the z transform

$$Y(z) = \frac{z}{z^2 + a^2}$$

where a is a real constant.


Solution In view of the factor z in the numerator, we resolve Y(z)/z into partial fractions, giving


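$$\frac{Y(z)}{z} = \frac{1}{z^2 + a^2} = \frac{1}{(z - ja)(z + ja)} = \frac{1}{j2a(z - ja)} - \frac{1}{j2a(z + ja)}$$

That is

$$Y(z) = \frac{1}{j2a}\left(\frac{z}{z - ja} - \frac{z}{z + ja}\right)$$

Using the result $\mathcal{Z}^{-1}[z/(z-a)] = \{a^k\}$, we have

$$\mathcal{Z}^{-1}\left[\frac{z}{z - ja}\right] = \{(ja)^k\} = \{j^k a^k\}$$

$$\mathcal{Z}^{-1}\left[\frac{z}{z + ja}\right] = \{(-ja)^k\} = \{(-j)^k a^k\}$$

From the relation $e^{j\theta} = \cos\theta + j\sin\theta$, we have

$$j = e^{j\pi/2}, \qquad -j = e^{-j\pi/2}$$

so that

$$\mathcal{Z}^{-1}\left[\frac{z}{z - ja}\right] = \{a^k (e^{j\pi/2})^k\} = \{a^k e^{jk\pi/2}\} = \{a^k(\cos\tfrac12 k\pi + j\sin\tfrac12 k\pi)\}$$

$$\mathcal{Z}^{-1}\left[\frac{z}{z + ja}\right] = \{a^k(\cos\tfrac12 k\pi - j\sin\tfrac12 k\pi)\}$$

The linearity property then gives

$$\mathcal{Z}^{-1}[Y(z)] = \left\{\frac{a^k}{j2a}\left(\cos\tfrac12 k\pi + j\sin\tfrac12 k\pi - \cos\tfrac12 k\pi + j\sin\tfrac12 k\pi\right)\right\} = \{a^{k-1}\sin\tfrac12 k\pi\}$$


Whilst MATLAB or MAPLE may be used to obtain the inverse z transform when
complex partial fractions are involved, it is difficult to convert results into a simple
form, the difficult step being that of expressing complex exponentials in terms of
trigonometric functions.


Example 6.11 Invert

$$Y(z) = \frac{z}{z^2 - z + 1}$$


Solution The denominator of the transform may be factorized as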


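$$z^2 - z + 1 = \left(z - \tfrac12 - j\tfrac12\sqrt3\right)\left(z - \tfrac12 + j\tfrac12\sqrt3\right)$$

In exponential form we have $\tfrac12 \pm j\tfrac12\sqrt3 = e^{\pm j\pi/3}$, so the denominator may be written as

$$z^2 - z + 1 = (z - e^{j\pi/3})(z - e^{-j\pi/3})$$

We then have

$$\frac{Y(z)}{z} = \frac{1}{(z - e^{j\pi/3})(z - e^{-j\pi/3})}$$

which can be resolved into partial fractions as

$$\frac{Y(z)}{z} = \frac{1}{e^{j\pi/3} - e^{-j\pi/3}}\,\frac{1}{z - e^{j\pi/3}} + \frac{1}{e^{-j\pi/3} - e^{j\pi/3}}\,\frac{1}{z - e^{-j\pi/3}}$$

Noting that $\sin\theta = (e^{j\theta} - e^{-j\theta})/j2$, this reduces to

$$\frac{Y(z)}{z} = \frac{1}{j2\sin\tfrac13\pi}\left(\frac{1}{z - e^{j\pi/3}} - \frac{1}{z - e^{-j\pi/3}}\right) = \frac{1}{j\sqrt3}\left(\frac{1}{z - e^{j\pi/3}} - \frac{1}{z - e^{-j\pi/3}}\right)$$

Multiplying through by z and using the result $\mathcal{Z}^{-1}[z/(z-a)] = \{a^k\}$, this gives

$$\mathcal{Z}^{-1}[Y(z)] = \left\{\frac{1}{j\sqrt3}\left(e^{jk\pi/3} - e^{-jk\pi/3}\right)\right\} = \left\{\frac{2}{\sqrt3}\sin\tfrac13 k\pi\right\}$$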
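The trigonometric results of Examples 6.10 and 6.11 can be checked without any complex arithmetic. Cross-multiplying $Y(z)$ by its denominator and comparing coefficients of powers of $z^{-1}$ gives a two-term recurrence for the sequence. The sketch below generates each sequence that way and compares it with the closed form; the helper names and the value a = 1.5 are assumptions chosen for illustration.

```python
import math

def from_recurrence(y0, y1, step, terms):
    """Sequence with given first two terms and y_{k+2} = step(y_{k+1}, y_k)."""
    y = [y0, y1]
    while len(y) < terms:
        y.append(step(y[-1], y[-2]))
    return y[:terms]

def close(u, v, tol=1e-9):
    """True when two sequences agree term by term to a relative tolerance."""
    return all(abs(p - q) <= tol * max(1.0, abs(p)) for p, q in zip(u, v))

a = 1.5  # illustrative value of the real constant in Example 6.10

# Example 6.10: (z^2 + a^2) Y(z) = z  gives  y_0 = 0, y_1 = 1, y_{k+2} = -a^2 y_k
seq_610 = from_recurrence(0.0, 1.0, lambda nxt, cur: -a * a * cur, 12)
formula_610 = [a ** (k - 1) * math.sin(k * math.pi / 2) for k in range(12)]

# Example 6.11: (z^2 - z + 1) Y(z) = z  gives  y_0 = 0, y_1 = 1, y_{k+2} = y_{k+1} - y_k
seq_611 = from_recurrence(0.0, 1.0, lambda nxt, cur: nxt - cur, 12)
formula_611 = [2 / math.sqrt(3) * math.sin(k * math.pi / 3) for k in range(12)]

print(close(seq_610, formula_610), close(seq_611, formula_611))
```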


We conclude this section with two further examples, illustrating the inversion 
technique applied to frequently occurring transform types. 


Example 6.12 Find the sequence whose z transform is

$$F(z) = \frac{z^3 + 2z^2 + 1}{z^3}$$


Solution F(z) is unlike any z transform treated so far in the examples. However, it is readily
expanded in a power series in $z^{-1}$ as

$$F(z) = 1 + \frac{2}{z} + \frac{1}{z^3}$$

Using (6.4), it is then apparent that

$$\mathcal{Z}^{-1}[F(z)] = \{f_k\} = \{1, 2, 0, 1, 0, 0, \ldots\}$$


The MATLAB command

    iztrans((z^3+2*z^2+1)/z^3,k)

returns

    charfcn[0](k)+2*charfcn[1](k)+charfcn[3](k)

which corresponds to the sequence

    {1, 2, 0, 1, 0, 0, ...}


Example 6.13 Find $\mathcal{Z}^{-1}[G(z)]$ where

$$G(z) = \frac{z(1 - e^{-aT})}{(z - 1)(z - e^{-aT})}$$

where a and T are positive constants.


Solution Resolving into partial fractions,

$$\frac{G(z)}{z} = \frac{1}{z - 1} - \frac{1}{z - e^{-aT}}$$

giving

$$G(z) = \frac{z}{z - 1} - \frac{z}{z - e^{-aT}}$$

Using the result $\mathcal{Z}^{-1}[z/(z-a)] = \{a^k\}$, we have

$$\mathcal{Z}^{-1}[G(z)] = \{1 - e^{-akT}\} \quad (k \ge 0)$$

In this particular example G(z) is the z transform of a sequence derived by sampling the
continuous-time signal

$$f(t) = 1 - e^{-at}$$

at intervals T.
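The link between G(z) and the sampled signal can be illustrated numerically. In the sketch below the values a = 0.5 and T = 0.1 are arbitrary assumptions. The coefficients of the expansion of G(z) in powers of $z^{-1}$ are generated from the recurrence obtained by cross-multiplying G(z) by its denominator, and are then compared with samples of $f(t) = 1 - e^{-at}$ taken at $t = kT$.

```python
import math

a, T = 0.5, 0.1            # illustrative values; any positive constants will do
c = math.exp(-a * T)       # the pole e^{-aT}

# (z^2 - (1 + c) z + c) G(z) = (1 - c) z  gives
# g_0 = 0, g_1 = 1 - c and g_{k+2} = (1 + c) g_{k+1} - c g_k.
g = [0.0, 1.0 - c]
for k in range(20):
    g.append((1 + c) * g[k + 1] - c * g[k])

# Samples of the continuous-time signal at t = kT.
samples = [1 - math.exp(-a * k * T) for k in range(len(g))]

max_error = max(abs(u - v) for u, v in zip(g, samples))
print(max_error < 1e-12)
```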


The MATLAB commands

    syms k z a T
    iztrans((z*(1-exp(-a*T)))/((z-1)*(z-exp(-a*T))),k);
    pretty(simple(ans))

return

    ans = 1-exp(-aT)^k

In MAPLE the command

    invztrans((z*(1-exp(-a*T)))/((z-1)*(z-exp(-a*T))),z,k);

returns an equivalent expression for $1 - e^{-akT}$.


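6.4.2 Exercises

Confirm your answers using MATLAB or MAPLE whenever possible.

11 Invert the following z transforms. Give the general term of the sequence in each case.

(a) $\dfrac{z}{z-1}$
(b) $\dfrac{z}{z+1}$
(c) $\dfrac{z}{z-\frac12}$
(d) $\dfrac{z}{3z+1}$
(e) $\dfrac{z}{z-j}$
(f) $\dfrac{z}{z+j\sqrt2}$
(g) $\dfrac{1}{z-1}$
(h) $\dfrac{z+2}{z+1}$

12 By first resolving Y(z)/z into partial fractions, find $\mathcal{Z}^{-1}[Y(z)]$ when Y(z) is given by

(a) $\dfrac{z}{(z-1)(z+2)}$
(b) $\dfrac{z}{(2z+1)(z-3)}$
(c) $\dfrac{z^2}{(2z+1)(z-1)}$
(d) $\dfrac{2z}{2z^2+z-1}$
(e) $\dfrac{z}{z^2+1}$ [Hint: $z^2 + 1 = (z + j)(z - j)$]
(f) $\dfrac{z}{z^2 - 2\sqrt3\,z + 4}$
(g) $\dfrac{2z^2 - 7z}{(z-1)^2(z-3)}$
(h) $\dfrac{z^2}{(z-1)^2(z^2 - z + 1)}$

13 Find $\mathcal{Z}^{-1}[Y(z)]$ when Y(z) is given by

(a) $\dfrac{1}{z} + \dfrac{2}{z^7}$
(b) $1 + \dfrac{3}{z^2} - \dfrac{2}{z^9}$
(c) $\dfrac{3z + z^2 + 5z^5}{z^5}$
(d) $\dfrac{1+z}{z^3} + \dfrac{3z}{3z+1}$
(e) $\dfrac{2z^3 + 6z^2 + 5z + 1}{z^2(2z+1)}$
(f) $\dfrac{2z^2 - 7z + 7}{(z-1)^2(z-2)}$
(g) $\dfrac{z-3}{z^2 - 3z + 2}$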
6.5 Discrete-time systems and difference equations


In Chapter 5 the Laplace transform technique was examined, first as a method for
solving differential equations, then as a way of characterizing a continuous-time system.
In fact, much could be deduced concerning the behaviour of the system and its properties
by examining its transform-domain representation, without looking for specific
time-domain responses at all. In this section we shall discuss the idea of a linear
discrete-time system and its model, a difference equation. Later we shall see that the
z transform plays an analogous role to the Laplace transform for such systems, by
providing a transform-domain representation of the system.


6.5.1 Difference equations


First we illustrate the motivation for studying difference equations by means of an
example.

Suppose that a sequence of observations $\{x_k\}$ is being recorded and we receive
observation $x_k$ at (time) step or index k. We might attempt to process (for example,
smooth or filter) this sequence of observations $\{x_k\}$ using the discrete-time feedback
system illustrated in Figure 6.4. At time step k the observation $x_k$ enters the system as
an input, and, after combination with the ‘feedback’ signal at the summing junction S,
proceeds to the block labelled D. This block is a unit delay block, and its function is to
hold its input signal until the ‘clock’ advances one step, to step k + 1. At this time the
input signal is passed without alteration to become the signal $y_{k+1}$, the (k + 1)th member
of the output sequence $\{y_k\}$. At the same time this signal is fed back through a scaling
block of amplitude $\alpha$ to the summing junction S. This process is instantaneous, and at
S the feedback signal is subtracted from the next input observation $x_{k+1}$ to provide the
next input to the delay block D. The process then repeats at each ‘clock’ step.

Figure 6.4 Discrete-time signal processing system.

To analyse the system, let $\{r_k\}$ denote the sequence of input signals to D; then,
owing to the delay action of D, we have

$$y_{k+1} = r_k$$

Also, owing to the feedback action,

$$r_k = x_k - \alpha y_k$$

where $\alpha$ is the feedback gain. Combining the two expressions gives

$$y_{k+1} = x_k - \alpha y_k$$

or

$$y_{k+1} + \alpha y_k = x_k \tag{6.24}$$

Equation (6.24) is an example of a first-order difference equation, and it relates adjacent
members of the sequence $\{y_k\}$ to each other and to the input sequence $\{x_k\}$.
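The action of the summing junction and the unit delay can be mimicked step by step. The following is a minimal sketch under stated assumptions: the function name `feedback_system`, the gain $\alpha = \tfrac12$, the unit step input and the zero starting value are all chosen for illustration, and the closed form used as a check is the standard geometric-sum solution rather than a result quoted in the text.

```python
from fractions import Fraction

def feedback_system(x, alpha, y0=Fraction(0)):
    """Outputs y_0, y_1, ... of the single-delay feedback loop driven by the inputs x."""
    y = [y0]
    for xk in x:
        r = xk - alpha * y[-1]   # summing junction: r_k = x_k - alpha * y_k
        y.append(r)              # unit delay:       y_{k+1} = r_k
    return y

alpha = Fraction(1, 2)
y = feedback_system([1] * 6, alpha)
print([str(v) for v in y])       # ['0', '1', '1/2', '3/4', '5/8', '11/16', '21/32']

# For a unit step input and y_0 = 0 the outputs follow (1 - (-alpha)^k) / (1 + alpha).
check = [(1 - (-alpha) ** k) / (1 + alpha) for k in range(len(y))]
print(y == check)
```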


A solution of the difference equation (6.24) is a formula for $y_k$, the general term of
the output sequence $\{y_k\}$, and this will depend on both k and the input sequence $\{x_k\}$ as
well as, in this case, the feedback gain $\alpha$.


Example 6.14 Find a difference equation to represent the system shown in Figure 6.5, having input
and output sequences $\{x_k\}$ and $\{y_k\}$ respectively, where D is the unit delay block and a
and b are constant feedback gains.

Figure 6.5 The system for Example 6.14.


Solution Introducing intermediate signal sequences $\{r_k\}$ and $\{v_k\}$ as shown in Figure 6.5, at each
step the outputs of the delay blocks are


$$y_{k+1} = v_k \tag{6.25}$$

$$v_{k+1} = r_k \tag{6.26}$$

and at the summing junction

$$r_k = x_k - a v_k + b y_k \tag{6.27}$$

From (6.25),

$$y_{k+2} = v_{k+1}$$

which on using (6.26) gives

$$y_{k+2} = r_k$$

Substituting for $r_k$ from (6.27) then gives

$$y_{k+2} = x_k - a v_k + b y_k$$

which on using (6.25) becomes

$$y_{k+2} = x_k - a y_{k+1} + b y_k$$

Rearranging this gives

$$y_{k+2} + a y_{k+1} - b y_k = x_k \tag{6.28}$$


as the difference equation representing the system. 


The difference equation (6.28) is an example of a second-order linear constant-coefficient
difference equation, and there are strong similarities between this and a second-order
linear constant-coefficient differential equation. It is of second order because the
term involving the greatest shift of the $\{y_k\}$ sequence is the term in $y_{k+2}$, implying a shift
of two steps. As demonstrated by Example 6.14, the degree of shift, or the order of the
equation, is closely related to the number of delay blocks in the block diagram.
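The correspondence between the two delay blocks and the second-order equation (6.28) can be seen by simulating the block diagram relations (6.25) to (6.27) directly. In this sketch the function name, the input values and the starting contents of the delay blocks are illustrative assumptions. Each delay block holds one stored value, and the output produced is checked against (6.28).

```python
def simulate_two_delay(x, a, b, y0=0, v0=0):
    """Outputs y_0, y_1, ... of the two-delay system described by (6.25)-(6.27)."""
    y, v = y0, v0                 # current outputs of the two delay blocks
    out = [y]
    for xk in x:
        r = xk - a * v + b * y    # summing junction (6.27)
        y, v = v, r               # delays advance: y_{k+1} = v_k, v_{k+1} = r_k
        out.append(y)
    return out

a, b = 1, 2
x = [3, -1, 4, 1, -5, 9, 2, 6]    # an arbitrary input sequence
y = simulate_two_delay(x, a, b, y0=0, v0=1)

# Every consecutive triple of outputs should satisfy (6.28).
residuals = [y[k + 2] + a * y[k + 1] - b * y[k] - x[k] for k in range(len(x) - 1)]
print(residuals)                  # all zeros
```

Because the stored values y0 and v0 play the role of $y_0$ and $y_1$, two starting values are needed to run the simulation, one per delay block.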
6.5.2 The solution of difference equations


Difference equations arise in a variety of ways, sometimes from the direct modelling of
systems in discrete time or as an approximation to a differential equation describing the
behaviour of a system modelled as a continuous-time system. We do not discuss this
further here; rather we restrict ourselves to the technique of solution, although examples of
applications will be apparent from the exercises. The z-transform method is based upon
the second shift property (Section 6.3.3), and it will quickly emerge as a technique
almost identical to the Laplace transform method for ordinary differential equations
introduced in Section 5.3.3. We shall introduce the method by means of an example.


Example 6.15 If in Example 6.14, a = 1, b = 2 and the input sequence $\{x_k\}$ is the unit step sequence
{1}, solve the resulting difference equation (6.28).


Solution Substituting for a, b and $\{x_k\}$ in (6.28) leads to the difference equation

$$y_{k+2} + y_{k+1} - 2y_k = 1 \quad (k \ge 0) \tag{6.29}$$

Taking z transforms throughout in (6.29) gives

$$\mathcal{Z}\{y_{k+2} + y_{k+1} - 2y_k\} = \mathcal{Z}\{1, 1, 1, \ldots\}$$

which, on using the linearity property and the result $\mathcal{Z}\{1\} = z/(z-1)$, may be written as

$$\mathcal{Z}\{y_{k+2}\} + \mathcal{Z}\{y_{k+1}\} - 2\mathcal{Z}\{y_k\} = \frac{z}{z-1}$$

Using (6.16) and (6.17) then gives

$$[z^2Y(z) - z^2y_0 - zy_1] + [zY(z) - zy_0] - 2Y(z) = \frac{z}{z-1}$$

which on rearranging leads to

$$(z^2 + z - 2)Y(z) = \frac{z}{z-1} + z^2y_0 + z(y_1 + y_0) \tag{6.30}$$

To proceed, we need some further information, namely the first and second terms $y_0$ and
$y_1$ of the solution sequence $\{y_k\}$. Without this additional information, we cannot find a
unique solution. As we saw in Section 5.3.3, this compares with the use of the Laplace
transform method to solve second-order differential equations, where the values of the
solution and its first derivative at time t = 0 are required.
Suppose that we know (or are given) that

$$y_0 = 0, \qquad y_1 = 1$$

Then (6.30) becomes

$$(z^2 + z - 2)Y(z) = z + \frac{z}{z-1}$$

or

$$(z + 2)(z - 1)Y(z) = z + \frac{z}{z-1}$$

and solving for Y(z) gives

$$Y(z) = \frac{z}{(z+2)(z-1)} + \frac{z}{(z+2)(z-1)^2} = \frac{z^2}{(z+2)(z-1)^2} \tag{6.31}$$

To obtain the solution sequence $\{y_k\}$, we must take the inverse transform in (6.31).
Proceeding as in Section 6.4, we resolve Y(z)/z into partial fractions as

$$\frac{Y(z)}{z} = \frac{z}{(z+2)(z-1)^2} = \frac13\,\frac{1}{(z-1)^2} + \frac29\,\frac{1}{z-1} - \frac29\,\frac{1}{z+2}$$

and so

$$Y(z) = \frac13\,\frac{z}{(z-1)^2} + \frac29\,\frac{z}{z-1} - \frac29\,\frac{z}{z+2}$$

Using the results $\mathcal{Z}^{-1}[z/(z-a)] = \{a^k\}$ and $\mathcal{Z}^{-1}[z/(z-1)^2] = \{k\}$ from Figure 6.3, we
obtain

$$\{y_k\} = \left\{\tfrac13 k + \tfrac29 - \tfrac29(-2)^k\right\} \quad (k \ge 0)$$

as the solution sequence for the difference equation satisfying the conditions $y_0 = 0$
and $y_1 = 1$.
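A solution obtained by the z-transform method is easily checked by running the difference equation forward from the given starting values. The sketch below does this for (6.29) in exact rational arithmetic; the function name `closed_form` is an assumption used only here.

```python
from fractions import Fraction

def closed_form(k):
    """The solution sequence found above: k/3 + 2/9 - (2/9)(-2)^k."""
    return Fraction(1, 3) * k + Fraction(2, 9) - Fraction(2, 9) * (-2) ** k

# Run y_{k+2} = 1 - y_{k+1} + 2 y_k forward from y_0 = 0, y_1 = 1.
y = [Fraction(0), Fraction(1)]
for k in range(10):
    y.append(1 - y[k + 1] + 2 * y[k])

print(all(y[k] == closed_form(k) for k in range(len(y))))
```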


The method adopted in Example 6.15 is called the z-transform method for solving 
linear constant-coefficient difference equations, and is analogous to the Laplace 
transform method for solving linear constant-coefficient differential equations. 

To conclude this section, two further examples are given to help consolidate under- 
standing of the method. 


Such difference equations can be solved directly in MAPLE using the rsolve
command. In the current version of the Symbolic Math Toolbox in MATLAB there
appears to be no equivalent command for directly solving a difference equation.
However, as we saw in Section 5.5.5, using the maple command in MATLAB
lets us access MAPLE commands directly. Hence, for the difference equation in
Example 6.15, using the command

    maple('rsolve({y(k+2)+y(k+1)-2*y(k)=1,y(0)=0,y(1)=1},y(k))')

in MATLAB returns the calculated answer

    2/9-2/9*(-2)^k+1/3*k

In MAPLE difference equations can be solved directly using rsolve, so that the
command

    rsolve({y(k+2)+y(k+1)-2*y(k)=1,y(0)=0,y(1)=1},y(k));

returns an equivalent expression for $\tfrac29 - \tfrac29(-2)^k + \tfrac13 k$.


Example 6.16 Solve the difference equation

$$8y_{k+2} - 6y_{k+1} + y_k = 9 \quad (k \ge 0)$$

given that $y_0 = 1$ and $y_1 = \tfrac32$.


Solution Taking z transforms

$$8\mathcal{Z}\{y_{k+2}\} - 6\mathcal{Z}\{y_{k+1}\} + \mathcal{Z}\{y_k\} = 9\mathcal{Z}\{1\}$$

Using (6.16) and (6.17) and the result $\mathcal{Z}\{1\} = z/(z-1)$ gives


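$$8[z^2Y(z) - z^2y_0 - zy_1] - 6[zY(z) - zy_0] + Y(z) = \frac{9z}{z-1}$$

which on rearranging leads to

$$(8z^2 - 6z + 1)Y(z) = 8z^2y_0 + 8zy_1 - 6zy_0 + \frac{9z}{z-1}$$

We are given that $y_0 = 1$ and $y_1 = \tfrac32$, so

$$(8z^2 - 6z + 1)Y(z) = 8z^2 + 6z + \frac{9z}{z-1}$$

or

$$\frac{Y(z)}{z} = \frac{8z + 6}{(4z-1)(2z-1)} + \frac{9}{(4z-1)(2z-1)(z-1)} = \frac{z + \frac34}{(z-\frac14)(z-\frac12)} + \frac{\frac98}{(z-\frac14)(z-\frac12)(z-1)}$$

Resolving into partial fractions gives

$$\frac{Y(z)}{z} = \frac{5}{z-\frac12} - \frac{4}{z-\frac14} + \frac{6}{z-\frac14} - \frac{9}{z-\frac12} + \frac{3}{z-1} = \frac{2}{z-\frac14} - \frac{4}{z-\frac12} + \frac{3}{z-1}$$

and so

$$Y(z) = \frac{2z}{z-\frac14} - \frac{4z}{z-\frac12} + \frac{3z}{z-1}$$

Using the result $\mathcal{Z}^{-1}\{z/(z-a)\} = \{a^k\}$ from Figure 6.3, we take inverse transforms, to
obtain

$$\{y_k\} = \left\{2\left(\tfrac14\right)^k - 4\left(\tfrac12\right)^k + 3\right\} \quad (k \ge 0)$$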


as the required solution. 


Check that in MATLAB the command

    maple('rsolve({8*y(k+2)-6*y(k+1)+y(k)=9,y(0)=1,y(1)=3/2},y(k))')

returns the calculated answer or alternatively use the command rsolve in MAPLE.


Example 6.17 Solve the difference equation

$$y_{k+2} + 2y_k = 0 \quad (k \ge 0)$$

given that $y_0 = 1$ and $y_1 = \sqrt2$.


Solution Taking z transforms, we have
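$$[z^2Y(z) - z^2y_0 - zy_1] + 2Y(z) = 0$$

and substituting the given values of $y_0$ and $y_1$ gives

$$z^2Y(z) - z^2 - \sqrt2\,z + 2Y(z) = 0$$

or

$$(z^2 + 2)Y(z) = z^2 + \sqrt2\,z$$

Resolving Y(z)/z into partial fractions gives

$$\frac{Y(z)}{z} = \frac{z + \sqrt2}{z^2 + 2} = \frac{z + \sqrt2}{(z + j\sqrt2)(z - j\sqrt2)}$$

Following the approach adopted in Example 6.11, we write

$$j\sqrt2 = \sqrt2\,e^{j\pi/2}, \qquad -j\sqrt2 = \sqrt2\,e^{-j\pi/2}$$

$$\frac{Y(z)}{z} = \frac{z + \sqrt2}{(z - \sqrt2\,e^{j\pi/2})(z - \sqrt2\,e^{-j\pi/2})} = \frac{(1+j)/j2}{z - \sqrt2\,e^{j\pi/2}} - \frac{(1-j)/j2}{z - \sqrt2\,e^{-j\pi/2}}$$

Thus

$$Y(z) = \frac{1}{j2}\left[(1+j)\frac{z}{z - \sqrt2\,e^{j\pi/2}} - (1-j)\frac{z}{z - \sqrt2\,e^{-j\pi/2}}\right]$$

which on taking inverse transforms gives

$$\{y_k\} = \left\{\frac{2^{k/2}}{j2}\left[(1+j)e^{jk\pi/2} - (1-j)e^{-jk\pi/2}\right]\right\} = \left\{2^{k/2}\left(\cos\tfrac12 k\pi + \sin\tfrac12 k\pi\right)\right\} \quad (k \ge 0)$$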


as the required solution. 


The solution in Example 6.17 was found to be a real-valued sequence, and this
comes as no surprise because the given difference equation and the ‘starting’ values $y_0$
and $y_1$ involved only real numbers. This observation provides a useful check on the
algebra when complex partial fractions are involved.
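This check can also be carried out numerically. The sketch below evaluates the complex partial-fraction form of the solution of Example 6.17, confirms that its imaginary part vanishes to rounding error, and compares the real part with both the trigonometric form and the values produced by the difference equation itself. The function names are assumptions used only for this illustration.

```python
import cmath
import math

p = math.sqrt(2) * cmath.exp(1j * math.pi / 2)   # the pole j*sqrt(2); the other is its conjugate

def y_complex(k):
    """Solution assembled from the two complex partial-fraction terms."""
    return ((1 + 1j) * p ** k - (1 - 1j) * p.conjugate() ** k) / 2j

def y_trig(k):
    """The simplified real form 2^(k/2) (cos(k*pi/2) + sin(k*pi/2))."""
    return 2 ** (k / 2) * (math.cos(k * math.pi / 2) + math.sin(k * math.pi / 2))

# Values generated directly from y_{k+2} = -2 y_k with y_0 = 1, y_1 = sqrt(2).
y = [1.0, math.sqrt(2)]
for k in range(10):
    y.append(-2 * y[k])

largest_imag = max(abs(y_complex(k).imag) for k in range(12))
largest_gap = max(abs(y_complex(k).real - y_trig(k)) + abs(y_trig(k) - y[k]) for k in range(12))
print(largest_imag < 1e-9, largest_gap < 1e-9)
```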
If complex partial fractions are involved then, as was mentioned at the end of Example 6.10,
it is difficult to simplify answers when determining inverse z transforms
using MATLAB. When such partial fractions arise in the solution of difference
equations, use of the command evalc alongside rsolve in MAPLE attempts to
express complex exponentials in terms of trigonometric functions, leading in most
cases to simplified answers.

Considering the difference equation of Example 6.17, using the command

    maple('rsolve({y(k+2)+2*y(k)=0,y(0)=1,y(1)=2^(1/2)},y(k))')

in MATLAB returns the answer

    (1/2+1/2*i)*(-i*2^(1/2))^k+(1/2-1/2*i)*(i*2^(1/2))^k

whilst using the command

    maple('evalc(rsolve({y(k+2)+2*y(k)=0,y(0)=1,y(1)=2^(1/2)},y(k)))')

returns the answer

    exp(1/2*log(2)*k)*cos(1/2*k*pi)+exp(1/2*log(2)*k)*sin(1/2*k*pi)

Noting that $e^{\log 2} = 2$ it is readily seen that this corresponds to the calculated
answer

$$2^{k/2}\left(\cos\tfrac12 k\pi + \sin\tfrac12 k\pi\right)$$


6.5.3 Exercises


Check your answers using MATLAB or MAPLE whenever possible.


14 Find difference equations representing the discrete-time systems shown in Figure 6.6.

Figure 6.6 The systems for Exercise 14.

15 Using z-transform methods, solve the following difference equations:

(a) $y_{k+2} - 2y_{k+1} + y_k = 0$ subject to $y_0 = 0$, $y_1 = 1$
(b) $y_{k+2} - 8y_{k+1} - 9y_k = 0$ subject to $y_0 = 2$, $y_1 = 1$
(c) $y_{k+2} + 4y_k = 0$ subject to $y_0 = 0$, $y_1 = 1$
(d) $2y_{k+2} - 5y_{k+1} - 3y_k = 0$ subject to $y_0 = 3$, $y_1 = 2$

16 Using z-transform methods, solve the following difference equations:

(a) $6y_{k+2} + y_{k+1} - y_k = 3$ subject to $y_0 = y_1 = 0$
(b) $y_{k+2} - 3y_{k+1} + 6y_k = 5$ subject to $y_0 = 0$, $y_1 = 1$
(c) $y_{n+2} - 5y_{n+1} + 6y_n = (\tfrac12)^n$ subject to $y_0 = y_1 = 0$
(d) $y_{n+2} - 3y_{n+1} + 3y_n = 1$ subject to $y_0 = 1$, $y_1 = 0$
(e) $2y_{n+2} - 3y_{n+1} - 2y_n = 6n + 1$ subject to $y_0 = 1$, $y_1 = 2$
(f) $y_{n+2} - 4y_n = 3n - 5$ subject to $y_0 = y_1 = 0$


17 A person's capital at the beginning of, and expenditure
during, a given year k are denoted by $C_k$ and $E_k$
respectively, and satisfy the difference equations

$$C_{k+1} = 1.5C_k - E_k$$
$$E_{k+1} = 0.21C_k + 0.5E_k$$

(a) Show that eventually the person's capital
grows at 20% per annum.

(b) If the capital at the beginning of year 1 is £6000
and the expenditure during year 1 is £3720 then
find the year in which the expenditure is a
minimum and the capital at the beginning of
that year.


18 The dynamics of a discrete-time system are
determined by the difference equation

$$y_{k+2} - 5y_{k+1} + 6y_k = u_k$$

Determine the response of the system to the unit
step input

$$u_k = \begin{cases} 0 & (k < 0) \\ 1 & (k \ge 0) \end{cases}$$

given that $y_0 = y_1 = 1$.


19 As a first attempt to model the national economy,
it is assumed that the national income $I_k$ at year k
is given by

$$I_k = C_k + P_k + G_k$$

where $C_k$ is the consumer expenditure, $P_k$ is private
investment and $G_k$ is government expenditure.

It is also assumed that the consumer spending is
proportional to the national income in the previous
year, so that

$$C_k = aI_{k-1} \quad (0 < a < 1)$$

It is further assumed that private investment is
proportional to the change in consumer spending
over the previous year, so that

$$P_k = b(C_k - C_{k-1}) \quad (0 < b \le 1)$$

Show that under these assumptions the national
income $I_k$ is determined by the difference equation

$$I_{k+2} - a(1 + b)I_{k+1} + abI_k = G_{k+2}$$

If $a = \tfrac12$, $b = 1$, government spending is at a constant
level (that is, $G_k = G$ for all k) and $I_0 = 2G$,
$I_1 = 3G$, show that

$$I_k = 2\left[1 + \left(\tfrac12\right)^{k/2}\sin\tfrac14 k\pi\right]G$$

Discuss what happens as $k \to \infty$.


20 The difference equation for current in a particular
ladder network of N loops is

$$R_1 i_{n+1} + R_2(i_{n+1} - i_n) + R_2(i_{n+1} - i_{n+2}) = 0 \quad (0 \le n \le N - 2)$$

where $i_n$ is the current in the (n + 1)th loop, and $R_1$
and $R_2$ are constant resistors.

(a) Show that this may be written as

$$i_{n+2} - 2\cosh\alpha\; i_{n+1} + i_n = 0 \quad (0 \le n \le N - 2)$$

where

$$\alpha = \cosh^{-1}\left(1 + \frac{R_1}{2R_2}\right)$$

(b) By solving the equation in (a), show that

$$i_n = \frac{i_1\sinh n\alpha - i_0\sinh(n-1)\alpha}{\sinh\alpha} \quad (2 \le n \le N)$$


6.6 Discrete linear systems: characterization


In this section we examine the concept of a discrete-time linear system and its difference 
equation model. Ideas developed in Chapter 5 for continuous-time system modelling 
will be seen to carry over to discrete-time systems, and we shall see that the z transform 
is the key to the understanding of such systems. 


6.6.1 z transfer functions 


In Section 5.6, when considering continuous-time linear systems modelled by differential 
equations, we introduced the concept of the system (Laplace) transfer function. This is a 
powerful tool in the description of such systems, since it contains all the information 
on system stability and also provides a method of calculating the response to an
arbitrary input signal using a convolution integral. In the same way, we can identify a 
z transfer function for a discrete-time linear time-invariant system modelled by a difference 
equation, and we can arrive at results analogous to those of Chapter 5. 

Let us consider the general linear constant-coefficient difference equation model for
a linear time-invariant system, with input sequence $\{u_k\}$ and output sequence $\{y_k\}$. Both
$\{u_k\}$ and $\{y_k\}$ are causal sequences throughout. Such a difference equation model takes
the form

$$a_n y_{k+n} + a_{n-1} y_{k+n-1} + a_{n-2} y_{k+n-2} + \ldots + a_0 y_k = b_m u_{k+m} + b_{m-1} u_{k+m-1} + b_{m-2} u_{k+m-2} + \ldots + b_0 u_k \tag{6.32}$$

where $k \ge 0$ and n, m (with $n \ge m$) are positive integers and the $a_i$ and $b_j$ are constants.
The difference equation (6.32) differs in one respect from the examples considered in
Section 6.5 in that the possibility of delayed terms in the input sequence $\{u_k\}$ is also
allowed for. The order of the difference equation is n if $a_n \ne 0$, and for the system
to be physically realizable, $n \ge m$.

Assuming the system to be initially in a quiescent state, we take z transforms
throughout in (6.32) to give

$$(a_n z^n + a_{n-1} z^{n-1} + \ldots + a_0)Y(z) = (b_m z^m + b_{m-1} z^{m-1} + \ldots + b_0)U(z)$$

where $Y(z) = \mathcal{Z}\{y_k\}$ and $U(z) = \mathcal{Z}\{u_k\}$. The system discrete or z transfer function G(z)
is defined as

$$G(z) = \frac{Y(z)}{U(z)} = \frac{b_m z^m + b_{m-1} z^{m-1} + \ldots + b_0}{a_n z^n + a_{n-1} z^{n-1} + \ldots + a_0} \tag{6.33}$$

and is normally rearranged (by dividing numerator and denominator by $a_n$) so that the
coefficient of $z^n$ in the denominator is 1. In deriving G(z) in this form, we have assumed
that the system was initially in a quiescent state. This assumption is certainly valid for
the system (6.32) if

$$y_0 = y_1 = \ldots = y_{n-1} = 0$$
$$u_0 = u_1 = \ldots = u_{m-1} = 0$$

This is not the end of the story, however, and we shall use the term ‘quiescent’ to mean
that no non-zero values are stored on the delay elements before the initial time.
On writing

$$P(z) = b_m z^m + b_{m-1} z^{m-1} + \ldots + b_0$$
$$Q(z) = a_n z^n + a_{n-1} z^{n-1} + \ldots + a_0$$

the discrete transfer function may be expressed as

$$G(z) = \frac{P(z)}{Q(z)}$$

As for the continuous model in Section 5.6.1, the equation Q(z) = 0 is called the 
characteristic equation of the discrete system, its order, n, determines the order of the 
system, and its roots are referred to as the poles of the discrete transfer function. Like- 
wise, the roots of P(z) = 0 are referred to as the zeros of the discrete transfer function. 


Example 6.18 Draw a block diagram to represent the system modelled by the difference equation

$$y_{k+2} + 3y_{k+1} - y_k = u_k \tag{6.34}$$

and find the corresponding z transfer function.


Solution The difference equation may be thought of as a relationship between adjacent members
of the solution sequence $\{y_k\}$. Thus at each time step k we have from (6.34)

$$y_{k+2} = -3y_{k+1} + y_k + u_k \tag{6.35}$$

which provides a formula for $y_{k+2}$ involving $y_k$, $y_{k+1}$ and the input $u_k$. The structure shown
in Figure 6.7(a) illustrates the generation of the sequence $\{y_k\}$ from $\{y_{k+2}\}$ using two
delay blocks.

Figure 6.7 (a) The basic second-order block diagram substructure; (b) block diagram representation of (6.34).

We now use (6.35) as a prescription for generating the sequence $\{y_{k+2}\}$ and arrange
for the correct combination of signals to be formed at each step k at the input summing
junction S of Figure 6.7(a). This leads to the structure shown in Figure 6.7(b), which is
the required block diagram.

We can of course produce a block diagram in the z-transform domain, using a similar
process. Taking the z transform throughout in (6.34), under the assumption of a quiescent
initial state, we obtain

$$z^2Y(z) + 3zY(z) - Y(z) = U(z) \tag{6.36}$$

or

$$z^2Y(z) = -3zY(z) + Y(z) + U(z) \tag{6.37}$$

The representation (6.37) is the transform-domain version of (6.35), and the z-transform
domain basic structure corresponding to the time-domain structure of Figure 6.7(a) is
shown in Figure 6.8(a).

Figure 6.8 (a) The z-transform domain basic second-order block diagram substructure; (b) the z-transform domain block diagram representation of (6.34).

The unit delay blocks, labelled D in Figure 6.7(a), become ‘1/z’ elements in the
z-transform domain diagram, in line with the first shift property (6.15), where a number
$k_0$ of delay steps involves multiplication by $z^{-k_0}$.

It is now a simple matter to construct the ‘signal’ transform $z^2Y(z)$ from (6.37) and
arrange for it to be available at the input to the summing junction S in Figure 6.8(a).
The resulting block diagram is shown in Figure 6.8(b).

The z transfer function follows at once from (6.36) as

$$G(z) = \frac{Y(z)}{U(z)} = \frac{1}{z^2 + 3z - 1} \tag{6.38}$$


Example 6.19 A system is specified by its z transfer function

$$G(z) = \frac{z - 1}{z^2 + 3z + 2}$$

What is the order n of the system? Can it be implemented using only n delay elements?
Illustrate this.


Solution If $\{u_k\}$ and $\{y_k\}$ denote respectively the input and output sequences to the system
then

$$G(z) = \frac{Y(z)}{U(z)} = \frac{z - 1}{z^2 + 3z + 2}$$

so that

$$(z^2 + 3z + 2)Y(z) = (z - 1)U(z)$$


Taking inverse transforms, we obtain the corresponding difference equation model
assuming the system is initially in a quiescent state

$$y_{k+2} + 3y_{k+1} + 2y_k = u_{k+1} - u_k \tag{6.39}$$

The difference equation (6.39) has a more complex right-hand side than the difference
equation (6.34) considered in Example 6.18. This results from the existence of z
terms in the numerator of the transfer function. By definition, the order of the
difference equation (6.39) is still 2. However, realization of the system with two
delay blocks is not immediately apparent, although this can be achieved, as we shall
now illustrate.

Introduce a new signal sequence $\{r_k\}$ such that

$$(z^2 + 3z + 2)R(z) = U(z) \tag{6.40}$$

where $R(z) = \mathcal{Z}\{r_k\}$. In other words, $\{r_k\}$ is the output of the system having transfer
function $1/(z^2 + 3z + 2)$.
Multiplying both sides of (6.40) by z, we obtain

$$z(z^2 + 3z + 2)R(z) = zU(z)$$

or

$$(z^2 + 3z + 2)zR(z) = zU(z) \tag{6.41}$$

Subtracting (6.40) from (6.41) we have

$$(z^2 + 3z + 2)zR(z) - (z^2 + 3z + 2)R(z) = zU(z) - U(z)$$

giving

$$(z^2 + 3z + 2)[zR(z) - R(z)] = (z - 1)U(z)$$
Finally, choosing

$$Y(z) = zR(z) - R(z) \tag{6.42}$$

gives

$$(z^2 + 3z + 2)Y(z) = (z - 1)U(z)$$

which is a realization of the given transfer function.

To construct a block diagram realization of the system, we first construct a block
diagram representation of (6.40) as in Figure 6.9(a). We now ‘tap off’ appropriate
signals to generate Y(z) according to (6.42) to construct a block diagram representation
of the specified system. The resulting block diagram is shown in Figure 6.9(b).

In order to implement the system, we must exhibit a physically realizable time-domain
structure, that is one containing only D elements. Clearly, since Figure 6.9(b) contains
only ‘1/z’ blocks, we can immediately produce a realizable time-domain structure as
shown in Figure 6.9(c), where, as before, D is the unit delay block.

Figure 6.9 The z-transform block diagrams for (a) the system (6.40), (b) the system (6.39), and (c) the time-domain
realization of the system in Example 6.19.


Example 6.20 A system is specified by its z transfer function

$$G(z) = \frac{z}{z^2 + 0.3z + 0.02}$$

Draw a block diagram to illustrate a time-domain realization of the system. Find a
second structure that also implements the system.


Solution We know that if $\mathcal{Z}\{u_k\} = U(z)$ and $\mathcal{Z}\{y_k\} = Y(z)$ are the z transforms of the input and
output sequences respectively then, by definition,

$$G(z) = \frac{Y(z)}{U(z)} = \frac{z}{z^2 + 0.3z + 0.02} \tag{6.43}$$

which may be rewritten as

$$(z^2 + 0.3z + 0.02)Y(z) = zU(z)$$

Noting the presence of the factor z on the right-hand side, we follow the procedure of
Example 6.19 and consider the system

$$(z^2 + 0.3z + 0.02)R(z) = U(z) \tag{6.44}$$

Multiplying both sides by z, we have

$$(z^2 + 0.3z + 0.02)zR(z) = zU(z)$$

and so, if the output Y(z) = zR(z) is extracted from the block diagram corresponding to
(6.44), we have the block diagram representation of the given system (6.43). This is
illustrated in Figure 6.10(a), with the corresponding time-domain implementation
shown in Figure 6.10(b).

Figure 6.10 (a) The z-transform block diagram for the system of Example 6.20; and (b) the time-domain
implementation of (a).

To discover a second form of time-domain implementation, note that

$$G(z) = \frac{z}{z^2 + 0.3z + 0.02} = \frac{2}{z + 0.2} - \frac{1}{z + 0.1}$$

We may therefore write

$$Y(z) = G(z)U(z) = \left(\frac{2}{z + 0.2} - \frac{1}{z + 0.1}\right)U(z)$$

so that

$$Y(z) = R_1(z) - R_2(z)$$

where

$$R_1(z) = \frac{2}{z + 0.2}U(z) \tag{6.45a}$$

$$R_2(z) = \frac{1}{z + 0.1}U(z) \tag{6.45b}$$

From (6.45a), we have

$$(z + 0.2)R_1(z) = 2U(z)$$

which can be represented by the block diagram shown in Figure 6.11(a). Likewise,
(6.45b) may be represented by the block diagram shown in Figure 6.11(b).

Recalling that $Y(z) = R_1(z) - R_2(z)$, it is clear that the given system can be represented
and then implemented by an obvious coupling of the two subsystems represented by
(6.45a, b). The resulting z-transform block diagram is shown in Figure 6.11(c). The
time-domain version is readily obtained by replacing the ‘1/z’ blocks by D and the
transforms U(z) and Y(z) by their corresponding sequences $\{u_k\}$ and $\{y_k\}$ respectively.

Figure 6.11 The block diagrams for (a) the subsystem (6.45a), (b) the subsystem (6.45b), and (c) an
alternative z-transform block diagram for the system of Example 6.20.
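That the two structures of Example 6.20 implement the same system can be confirmed by simulating each one with the same input. The following is a minimal sketch under stated assumptions: the function names and the test input are illustrative, each stored variable stands for the output of one unit delay block D, and both structures start from a quiescent state.

```python
from fractions import Fraction as F

def direct_form(u):
    """Structure based on (6.44): two delays in cascade, with y_k = r_{k+1} tapped off."""
    d1 = d2 = F(0)                   # delay outputs r_{k+1} and r_k
    y = []
    for uk in u:
        y.append(d1)
        top = uk - F(3, 10) * d1 - F(1, 50) * d2   # r_{k+2} at the summing junction
        d1, d2 = top, d1
    return y

def parallel_form(u):
    """Structure based on (6.45a, b): two first-order subsystems, outputs subtracted."""
    s1 = s2 = F(0)                   # delay outputs r1_k and r2_k
    y = []
    for uk in u:
        y.append(s1 - s2)
        s1 = 2 * uk - F(1, 5) * s1   # (z + 0.2) R1(z) = 2 U(z)
        s2 = uk - F(1, 10) * s2      # (z + 0.1) R2(z) = U(z)
    return y

u = [F(n) for n in (1, 0, 2, -3, 5, 1, 1, 0)]
print(direct_form(u) == parallel_form(u))
```

Exact fractions are used so that the two output sequences can be compared for equality rather than to within rounding error.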


6.6.2 The impulse response


In Example 6.20 we saw that two quite different realizations were possible for the 
same transfer function G(z), and others are possible. Whichever realization of the 
transfer function is chosen, however, when presented with the same input sequence 
{u,}, the same output sequence 1 y,) will be produced. Thus we identify the system as 
characterized by its transfer function as the key concept, rather than any particular 
implementation. This idea is reinforced when we consider the impulse response sequence 
for a discrete-time linear time-invariant system, and its role in convolution sums. 
Consider the sequence 


{6,} = {1, 0, 0,...} 


that is, the sequence consisting of a single ‘pulse’ at 4 = 0, followed by a train of zeros. 
As we saw in Section 6.2.1, the z transform of this sequence is easily found from the 
definition (6.1) as 


FG, =1 (6.46) 


The sequence {6,} is called the impulse sequence, by analogy with the continuous- 
time counterpart 6(f), the impulse function. The analogy is perhaps clearer on con- 
sidering the transformed version (6.46). In continuous-time analysis, using Laplace 
transform methods, we observed (һа 2 5(0)} = 1, апа (6.46) shows that the ‘entity’ 
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Example 6.21 


Solution 


with z transform equal to unity is the sequence {6,}. It is in fact the property that 
¥{6,} = 1 that makes the impulse sequence of such great importance. 

Consider a system with transfer function G(z), so that the z transform Y(z) of the
output sequence $\{y_k\}$ corresponding to an input sequence $\{u_k\}$ with z transform U(z) is

$$Y(z) = G(z)U(z) \tag{6.47}$$

If the input sequence $\{u_k\}$ is the impulse sequence $\{\delta_k\}$ and the system is initially
quiescent, then the output sequence $\{y_{\delta_k}\}$ is called the impulse response of the system.
Hence

$$\mathcal{Z}\{y_{\delta_k}\} = Y_\delta(z) = G(z) \tag{6.48}$$

That is, the z transfer function of the system is the z transform of the impulse response.
Alternatively, we can say that the impulse response of a system is the inverse z transform
of the system transfer function. This compares with the definition of the impulse
response for continuous systems given in Section 5.6.3.

Substituting (6.48) into (6.47), we have

$$Y(z) = Y_\delta(z)U(z) \tag{6.49}$$

Thus the z transform of the system output in response to any input sequence $\{u_k\}$ is the
product of the transform of the input sequence with the transform of the system impulse
response. The result (6.49) shows the underlying relationship between the concepts of
impulse response and transfer function, and explains why the impulse response (or the
transfer function) is thought of as characterizing a system. In simple terms, if either of
these is known then we have all the information about the system for any analysis we
may wish to do.


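Example 6.21 Find the impulse response of the system with z transfer function

$$G(z) = \frac{z}{z^2 + 3z + 2}$$

Solution Using (6.48),

$$Y_\delta(z) = \frac{z}{z^2 + 3z + 2} = \frac{z}{(z+2)(z+1)}$$

Resolving $Y_\delta(z)/z$ into partial fractions gives

$$\frac{Y_\delta(z)}{z} = \frac{1}{(z+2)(z+1)} = \frac{1}{z+1} - \frac{1}{z+2}$$

which on inversion gives the impulse response sequence

$$\{y_{\delta_k}\} = \mathcal{Z}^{-1}\left[\frac{z}{z+1} - \frac{z}{z+2}\right] = \{(-1)^k - (-2)^k\} \quad (k \ge 0)$$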
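The result of Example 6.21 can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function name `impulse_response` and its argument convention (coefficients in descending powers of z) are assumptions made for illustration. It divides G(z) through by the highest power of z and runs the resulting recurrence with the impulse sequence as input.

```python
def impulse_response(num, den, n_terms):
    """First n_terms of the impulse response of G(z) = num(z)/den(z).

    num and den hold coefficients in descending powers of z, and den must
    have at least as many coefficients as num (hypothetical convention).
    """
    n = len(den) - 1
    # Pad the numerator with leading zeros so both polynomials have degree n.
    b = [0] * (n + 1 - len(num)) + list(num)
    a = list(den)
    y = []
    for k in range(n_terms):
        # With u = impulse: a[0]*y[k] = b[k] - sum_{i=1..n} a[i]*y[k-i]
        acc = b[k] if k <= n else 0
        for i in range(1, n + 1):
            if k - i >= 0:
                acc -= a[i] * y[k - i]
        y.append(acc / a[0])
    return y


# G(z) = z / (z^2 + 3z + 2) from Example 6.21
simulated = impulse_response([1, 0], [1, 3, 2], 6)
closed_form = [(-1) ** k - (-2) ** k for k in range(6)]
print(simulated)
print(closed_form)
```

The two printed lists should agree term by term, which is the numerical counterpart of inverting the partial fractions.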


Since the impulse response of a system is the inverse z transform of its transfer function
G(z) it can be obtained in MATLAB using the commands

    syms k z
    iztrans(G(z),k)

so for the G(z) of Example 6.21

    syms k z
    iztrans(z/(z^2+3*z+2),k)

returns

    ans = (-1)^k-(-2)^k

A plot of the impulse response is obtained using the commands

    z=tf('z',1);
    G=G(z);
    impulse(G)

Likewise in MAPLE the command

    invztrans(z/(z^2+3*z+2),z,k);

returns the same answer

    (-1)^k - (-2)^k

Example 6.22 A system has the impulse response sequence

$$\{y_{\delta_k}\} = \{a^k - 0.5^k\}$$

where a > 0 is a real constant. What is the nature of this response when (a) a = 0.4,
(b) a = 1.2? Find the step response of the system in both cases.

Solution When a = 0.4

$$\{y_{\delta_k}\} = \{0.4^k - 0.5^k\}$$

and, since both $0.4^k \to 0$ as $k \to \infty$ and $0.5^k \to 0$ as $k \to \infty$, we see that the terms of the
impulse response sequence go to zero as $k \to \infty$.

On the other hand, when a = 1.2, since $(1.2)^k \to \infty$ as $k \to \infty$, we see that in this case
the impulse response sequence terms become unbounded, implying that the system
‘blows up’.

In order to calculate the step response, we first determine the system transfer function
G(z), using (6.48), as

$$G(z) = Y_\delta(z) = \mathcal{Z}\{a^k - 0.5^k\}$$

giving

$$G(z) = \frac{z}{z-a} - \frac{z}{z-0.5}$$

The system step response is the system response to the unit step sequence $\{h_k\}$ =
{1, 1, 1, ...} which, from Figure 6.3, has z transform

$$\mathcal{Z}\{h_k\} = \frac{z}{z-1}$$

Hence, from (6.47), the step response is determined by

$$Y(z) = G(z)\mathcal{Z}\{h_k\} = \left(\frac{z}{z-a} - \frac{z}{z-0.5}\right)\frac{z}{z-1}$$

so that

$$\frac{Y(z)}{z} = \frac{z}{(z-a)(z-1)} - \frac{z}{(z-0.5)(z-1)}
= \frac{a}{a-1}\,\frac{1}{z-a} + \frac{1}{z-0.5} + \left(-2 + \frac{1}{1-a}\right)\frac{1}{z-1}$$

giving

$$Y(z) = \frac{a}{a-1}\,\frac{z}{z-a} + \frac{z}{z-0.5} + \left(-2 + \frac{1}{1-a}\right)\frac{z}{z-1}$$

which on taking inverse transforms gives the step response as

$$\{y_k\} = \left\{\frac{a}{a-1}\,a^k + (0.5)^k + \left(-2 + \frac{1}{1-a}\right)\right\} \tag{6.50}$$

Considering the output sequence (6.50), we see that when a = 0.4, since $(0.4)^k \to 0$
as $k \to \infty$ (and $(0.5)^k \to 0$ as $k \to \infty$), the output sequence terms tend to the constant
value

$$-2 + \frac{1}{1-0.4} = -0.3333$$

In the case of a = 1.2, since $(1.2)^k \to \infty$ as $k \to \infty$, the output sequence is unbounded,
and again the system ‘blows up’.
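As a numerical cross-check of (6.50), the sketch below (an illustration only, with hypothetical function names) builds the step response in a second way: for a unit step input the output at step k is the running sum of the impulse response terms up to k. It then compares that sum with the closed form obtained above.

```python
def impulse_term(a, k):
    """k-th term of the impulse response a^k - 0.5^k of Example 6.22."""
    return a ** k - 0.5 ** k


def step_by_summation(a, k):
    """Step response as the running sum of impulse response terms."""
    return sum(impulse_term(a, j) for j in range(k + 1))


def step_closed_form(a, k):
    """Step response according to (6.50)."""
    return a / (a - 1) * a ** k + 0.5 ** k + (-2 + 1 / (1 - a))


for a in (0.4, 1.2):
    for k in (0, 1, 5, 20):
        print(a, k, step_by_summation(a, k), step_closed_form(a, k))
```

For a = 0.4 the printed values settle towards -1/3, while for a = 1.2 they grow without bound, in line with the discussion of the two cases.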


6.6.3 Stability 


Example 6.22 illustrated the concept of system stability for discrete systems. When
a = 0.4, the impulse response decayed to zero with increasing k, and we observed
that the step response remained bounded (in fact, the terms of the sequence
approached a constant limiting value). However, when a = 1.2, the impulse response
became unbounded, and we observed that the step response also increased without
limit. In fact, as we saw for continuous systems in Section 5.6.3, a linear constant-coefficient
discrete-time system is stable provided that its impulse response goes to zero
as $k \to \infty$. As for the continuous case, we can relate this definition to the poles of the
system transfer function

$$G(z) = \frac{P(z)}{Q(z)}$$

As we saw in Section 6.6.1, the system poles are determined as the n roots of its
characteristic equation

$$Q(z) = a_n z^n + a_{n-1} z^{n-1} + \ldots + a_0 = 0 \tag{6.51}$$

For instance, in Example 6.19 we considered a system with transfer function

$$G(z) = \frac{z-1}{z^2 + 3z + 2}$$

having poles determined by $z^2 + 3z + 2 = 0$, that is poles at z = -1 and z = -2. Since the
impulse response is the inverse transform of G(z), we expect this system to ‘blow up’
or, rather, be unstable, because its impulse response sequence would be expected to
contain terms of the form $(-1)^k$ and $(-2)^k$, neither of which goes to zero as $k \to \infty$.
(Note that the term in $(-1)^k$ neither blows up nor goes to zero, simply alternating
between +1 and -1; however, $(-2)^k$ certainly becomes unbounded as $k \to \infty$.) On
the other hand, in Example 6.20 we encountered a system with transfer function


2 


G(z) = 5————— 
2 +0.3z+0.02 


having poles determined by

$$Q(z) = z^2 + 0.3z + 0.02 = (z + 0.2)(z + 0.1) = 0$$

that is poles at z = -0.2 and z = -0.1. Clearly, this system is stable, since its impulse
response contains terms in $(-0.2)^k$ and $(-0.1)^k$, both of which go to zero as $k \to \infty$.

Both of these illustrative examples gave rise to characteristic polynomials Q(z)
that were quadratic in form and that had real coefficients. More generally, Q(z) = 0
gives rise to a polynomial equation of order n, with real coefficients. From the theory
of polynomial equations, we know that Q(z) = 0 has n roots $\alpha_i$ (i = 1, 2, ..., n), which
may be real or complex (with complex roots occurring in conjugate pairs).

Hence the characteristic equation may be written in the form

$$Q(z) = a_n(z - \alpha_1)(z - \alpha_2) \ldots (z - \alpha_n) = 0 \tag{6.52}$$

The system poles $\alpha_i$ (i = 1, 2, ..., n) determined by (6.52) may be expressed in the polar
form

$$\alpha_i = r_i e^{j\theta_i} \quad (i = 1, 2, \ldots, n)$$

where $\theta_i = 0$ or $\pi$ if $\alpha_i$ is real. From the interpretation of the impulse response as the
inverse transform of the transfer function G(z) = P(z)/Q(z), it follows that the impulse
response sequence of the system will contain terms in

$$r_1^k e^{jk\theta_1}, \quad r_2^k e^{jk\theta_2}, \quad \ldots, \quad r_n^k e^{jk\theta_n}$$

Since, for stability, terms in the impulse response sequence must tend to zero as
$k \to \infty$, it follows that a system having characteristic equation Q(z) = 0 will be stable
provided that

$$r_i < 1 \quad \text{for } i = 1, 2, \ldots, n$$


Therefore a linear constant-coefficient discrete-time system with transfer function
G(z) is stable if and only if all the poles of G(z) lie within the unit circle |z| < 1 in
the complex z plane, as illustrated in Figure 6.12. If one or more poles lie outside
this unit circle then the system will be unstable. If one or more distinct poles lie on
the unit circle |z| = 1, with all the other poles inside, then the system is said to be
marginally stable.

Figure 6.12 Region of stability in the z plane.

Example 6.23 Which of the following systems, specified by their transfer function G(z), are stable?

(a) $G(z) = \dfrac{1}{z + 0.25}$    (b) $G(z) = \dfrac{z}{z^2 - z + 0.5}$    (c) $G(z) = \dfrac{z^2}{z^3 - 3z^2 + 2.5z - 1}$

Solution (a) The single pole is at z = -0.25, so $r_1 = 0.25 < 1$, and the system is stable.

(b) The system poles are determined by

$$z^2 - z + 0.5 = [z - 0.5(1 + j)][z - 0.5(1 - j)] = 0$$

giving the poles as the conjugate pair $z_1 = 0.5(1 + j)$, $z_2 = 0.5(1 - j)$. The amplitudes
$r_1 = r_2 = 0.707 < 1$, and again the system is stable.

(c) The system poles are determined by

$$z^3 - 3z^2 + 2.5z - 1 = (z - 2)[z - 0.5(1 + j)][z - 0.5(1 - j)] = 0$$

giving the poles as $z_1 = 2$, $z_2 = 0.5(1 + j)$, $z_3 = 0.5(1 - j)$, and so their amplitudes
are $r_1 = 2$, $r_2 = r_3 = 0.707$. Since $r_1 > 1$, it follows that the system is unstable.
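The pole-modulus test used in Example 6.23 is easy to mechanize. The sketch below is illustrative only: `quadratic_poles` and `classify` are hypothetical helper names, and the tolerance used to decide whether a pole sits on the unit circle is an arbitrary assumption. The label "marginally stable" further assumes that any poles on the unit circle are distinct, as the definition above requires.

```python
import cmath


def quadratic_poles(a2, a1, a0):
    """Roots of a2*z^2 + a1*z + a0 = 0 (possibly complex)."""
    disc = cmath.sqrt(a1 * a1 - 4 * a2 * a0)
    return [(-a1 + disc) / (2 * a2), (-a1 - disc) / (2 * a2)]


def classify(poles, tol=1e-9):
    """Classify a system from the moduli r_i = |pole| of its poles."""
    radii = [abs(p) for p in poles]
    if all(r < 1 - tol for r in radii):
        return "stable"
    if any(r > 1 + tol for r in radii):
        return "unstable"
    # Remaining case: poles on |z| = 1, none outside (assumed distinct).
    return "marginally stable"


print(classify([-0.25]))                                  # system (a)
print(classify(quadratic_poles(1, -1, 0.5)))              # system (b)
print(classify([2, 0.5 * (1 + 1j), 0.5 * (1 - 1j)]))      # system (c)
```

Applied to the two illustrative systems discussed before the example, `quadratic_poles(1, 3, 2)` gives poles at -1 and -2 (unstable) and `quadratic_poles(1, 0.3, 0.02)` gives poles at -0.1 and -0.2 (stable).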


According to our definition, it follows that to prove stability we must show that all
the roots of the characteristic equation

$$Q(z) = z^n + a_{n-1} z^{n-1} + \ldots + a_0 = 0 \tag{6.53}$$

lie within the unit circle |z| = 1 (note that for convenience we have arranged for the
coefficient of $z^n$ to be unity in (6.53)). Many mathematical criteria have been developed
to test for this property. One such method, widely used in practice, is the Jury stability
criterion introduced by E. I. Jury in 1963. This procedure gives necessary and sufficient
conditions for the polynomial equation (6.53) to have all its roots inside the unit
circle |z| = 1.

The first step in the procedure is to set up a table as in Figure 6.13 using information
from the given polynomial equation (6.53) and where

$$b_k = \begin{vmatrix} 1 & a_k \\ a_0 & a_{n-k} \end{vmatrix}, \quad
c_k = \begin{vmatrix} b_0 & b_{n-1-k} \\ b_{n-1} & b_k \end{vmatrix}, \quad
d_k = \begin{vmatrix} c_0 & c_{n-2-k} \\ c_{n-2} & c_k \end{vmatrix}, \quad \ldots, \quad
t_0 = \begin{vmatrix} r_0 & r_2 \\ r_2 & r_0 \end{vmatrix}$$

Row      z^n              z^(n-1)   z^(n-2)   ...  z^(n-k)     ...  z^2       z^1       z^0
1        1                a_{n-1}   a_{n-2}   ...  a_{n-k}     ...  a_2       a_1       a_0
2        a_0              a_1       a_2       ...  a_k         ...  a_{n-2}   a_{n-1}   1
3        Δ_1 = b_0        b_1       b_2       ...  b_k         ...  b_{n-2}   b_{n-1}
4        b_{n-1}          b_{n-2}   b_{n-3}   ...  b_{n-1-k}   ...  b_1       b_0
5        Δ_2 = c_0        c_1       c_2       ...  c_k         ...  c_{n-2}
6        c_{n-2}          c_{n-3}   c_{n-4}   ...  c_{n-2-k}   ...  c_0
7        Δ_3 = d_0        d_1       d_2       ...  d_{n-3}
8        d_{n-3}          d_{n-4}   d_{n-5}   ...  d_0
...
2n - 5   Δ_{n-3} = s_0    s_1       s_2       s_3
2n - 4   s_3              s_2       s_1       s_0
2n - 3   Δ_{n-2} = r_0    r_1       r_2
2n - 2   r_2              r_1       r_0
2n - 1   Δ_{n-1} = t_0

Figure 6.13 Jury stability table for the polynomial equation (6.53).

Note that the elements of row 2j + 2 consist of the elements of row 2j + 1 written in the
reverse order for j = 0, 1, 2, ..., n; that is, the elements of the even rows consist of the
elements of the odd rows written in reverse order. Necessary and sufficient conditions
for the polynomial equation (6.53) to have all its roots inside the unit circle |z| = 1 are
then given by

$$\begin{aligned}
&\text{(i)} && Q(1) > 0, \quad (-1)^n Q(-1) > 0 \\
&\text{(ii)} && \Delta_1 > 0, \quad \Delta_2 > 0, \quad \Delta_3 > 0, \quad \ldots, \quad \Delta_{n-2} > 0, \quad \Delta_{n-1} > 0
\end{aligned} \tag{6.54}$$


Example 6.24 Show that all the roots of the polynomial equation

$$F(z) = z^3 + \tfrac{1}{3}z^2 - \tfrac{1}{4}z - \tfrac{1}{12} = 0$$

lie within the unit circle |z| = 1.

Solution The corresponding Jury stability table is shown in Figure 6.14. In this case

(i)  $F(1) = 1 + \tfrac{1}{3} - \tfrac{1}{4} - \tfrac{1}{12} > 0$
     $(-1)^n F(-1) = (-1)^3\left(-1 + \tfrac{1}{3} + \tfrac{1}{4} - \tfrac{1}{12}\right) > 0$
(ii) $\Delta_1 = \tfrac{143}{144} > 0$, $\quad \Delta_2 = \left(\tfrac{143}{144}\right)^2 - \tfrac{4}{81} > 0$

Thus, by the criteria (6.54), all the roots lie within the unit circle. In this case this is
readily confirmed, since the polynomial F(z) may be factorized as

$$F(z) = \left(z - \tfrac{1}{2}\right)\left(z + \tfrac{1}{2}\right)\left(z + \tfrac{1}{3}\right) = 0$$

So the roots are $z_1 = \tfrac{1}{2}$, $z_2 = -\tfrac{1}{2}$ and $z_3 = -\tfrac{1}{3}$.

Row   z^3                                          z^2                                  z^1                                  z^0
1     1                                            1/3                                  -1/4                                 -1/12
2     -1/12                                        -1/4                                 1/3                                  1
3     Δ_1 = |1  -1/12; -1/12  1| = 143/144         |1  -1/4; -1/12  1/3| = 5/16         |1  1/3; -1/12  -1/4| = -2/9
4     -2/9                                         5/16                                 143/144
5     Δ_2 = |143/144  -2/9; -2/9  143/144| = 0.936 78

Figure 6.14 Jury stability table for Example 6.24.

The Jury stability table may also be used to determine how many roots of the
polynomial equation (6.53) lie outside the unit circle. The number of such roots is
determined by the number of changes in sign in the sequence

$$1, \ \Delta_1, \ \Delta_2, \ \ldots, \ \Delta_{n-1}$$

Example 6.25 Show that the polynomial equation

$$F(z) = z^3 - 3z^2 - \tfrac{1}{4}z + \tfrac{3}{4} = 0$$

has roots that lie outside the unit circle |z| = 1. Determine how many such roots there are.

Solution The corresponding Jury stability table is shown in Figure 6.15.

Row   z^3            z^2       z^1     z^0
1     1              -3        -1/4    3/4
2     3/4            -1/4      -3      1
3     Δ_1 = 7/16     -45/16    2
4     2              -45/16    7/16
5     Δ_2 = -975/256

Figure 6.15 Jury stability table for Example 6.25.

Hence, in this case

$$F(1) = 1 - 3 - \tfrac{1}{4} + \tfrac{3}{4} = -\tfrac{3}{2}$$

$$(-1)^n F(-1) = (-1)^3\left(-1 - 3 + \tfrac{1}{4} + \tfrac{3}{4}\right) = 3$$

As F(1) < 0, it follows from (6.54) that the polynomial equation has roots outside the
unit circle |z| = 1. From Figure 6.15, the sequence 1, $\Delta_1$, $\Delta_2$ is 1, $\tfrac{7}{16}$, $-\tfrac{975}{256}$, and since
there is only one sign change in the sequence, it follows that one root lies outside the
unit circle. Again this is readily confirmed, since F(z) may be factorized as

$$F(z) = \left(z - \tfrac{1}{2}\right)\left(z + \tfrac{1}{2}\right)(z - 3) = 0$$

showing that there is indeed one root outside the unit circle at z = 3.
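The construction of the Jury table lends itself to a short program. The following Python sketch is an illustration under stated assumptions rather than a definitive implementation: the names `jury_deltas` and `jury_test` are hypothetical, the polynomial is assumed monic with coefficients listed in descending powers as in (6.53), and exact rational arithmetic from the standard `fractions` module is used so that the entries can be compared directly with Figures 6.14 and 6.15.

```python
from fractions import Fraction


def jury_deltas(coeffs):
    """Return [Delta_1, ..., Delta_{n-1}] for z^n + a_{n-1} z^{n-1} + ... + a_0."""
    row = [Fraction(c) for c in coeffs]      # odd-numbered row of the table
    deltas = []
    while len(row) > 2:
        rev = row[::-1]                      # even-numbered row: same entries reversed
        # Each new entry is the 2x2 determinant |row[0] rev[k]; rev[0] row[k]|.
        row = [row[0] * row[k] - rev[0] * rev[k] for k in range(len(row) - 1)]
        deltas.append(row[0])
    return deltas


def jury_test(coeffs):
    """Apply conditions (6.54); also count sign changes in 1, Delta_1, ..., Delta_{n-1}."""
    n = len(coeffs) - 1
    q_plus = sum(Fraction(c) for c in coeffs)                                   # Q(1)
    q_minus = sum(Fraction(c) * (-1) ** (n - i) for i, c in enumerate(coeffs))  # Q(-1)
    deltas = jury_deltas(coeffs)
    stable = q_plus > 0 and (-1) ** n * q_minus > 0 and all(d > 0 for d in deltas)
    seq = [Fraction(1)] + deltas
    sign_changes = sum(1 for u, v in zip(seq, seq[1:]) if u * v < 0)
    return stable, deltas, sign_changes


ex_624 = [1, Fraction(1, 3), Fraction(-1, 4), Fraction(-1, 12)]
ex_625 = [1, -3, Fraction(-1, 4), Fraction(3, 4)]
print(jury_test(ex_624))
print(jury_test(ex_625))
```

For Example 6.24 the sketch reports all conditions satisfied; for Example 6.25 it reports a failure together with one sign change, corresponding to the single root at z = 3.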




Example 6.26 Consider the discrete-time feedback system of Figure 6.16, for which T is the sampling
period and k > 0 is a constant gain:

(a) Determine the z transform G(z) corresponding to the Laplace transform G(s).

(b) Determine the characteristic equation of the system when T = 1 and k = 6 and
show that the discrete-time system is unstable.

(c) For T = 1 show that the system is stable if and only if 0 < k < 4.33.

(d) Removing the sampler show that the corresponding continuous-time feedback
system is stable for all k > 0.

Figure 6.16 Discrete-time system of Example 6.26. [Block diagram: feedback loop whose
forward path is a sampler of period T followed by G(s) = k/(s(s + 1)).]

Solution (a) First invert the Laplace transform to give the corresponding time-domain
function f(t) and then determine the z transform of f(t):

$$G(s) = \frac{k}{s(s+1)} = \frac{k}{s} - \frac{k}{s+1}$$

$$f(t) = k - k e^{-t}$$

$$G_d(z) = \mathcal{Z}\{k\} - \mathcal{Z}\{k e^{-t}\} = \frac{kz}{z-1} - \frac{kz}{z - e^{-T}} = \frac{k(1 - e^{-T})z}{(z-1)(z - e^{-T})}$$

(b) With k = 6 and T = 1

$$G_d(z) = \frac{6(1 - e^{-1})z}{(z-1)(z - e^{-1})}$$

The closed loop transfer function is

$$\frac{G_d(z)}{1 + G_d(z)}$$

giving the characteristic equation $1 + G_d(z) = 0$ as

$$(z - 1)(z - e^{-1}) + 6(1 - e^{-1})z = 0$$

or

$$z^2 + z[6(1 - e^{-1}) - (1 + e^{-1})] + e^{-1} = 0$$

which reduces to

$$z^2 + 2.425z + 0.368 = 0$$

The roots of this characteristic equation are $z_1 = -0.163$ and $z_2 = -2.262$. Since
one of the roots lies outside the unit circle |z| = 1 the system is unstable.

(c) For T = 1 and general gain k > 0 the characteristic equation of the system is

$$F(z) = (z - 1)(z - e^{-1}) + k(1 - e^{-1})z = 0$$

which reduces to

$$F(z) = z^2 + (0.632k - 1.368)z + 0.368 = 0$$

By Jury’s procedure conditions for stability are:

$$F(1) = 1 + (0.632k - 1.368) + 0.368 > 0 \quad \text{since } k > 0$$

$$(-1)^2 F(-1) = 2.736 - 0.632k > 0 \quad \text{provided } k < \frac{2.736}{0.632} = 4.33$$

$$\Delta_1 = \begin{vmatrix} 1 & 0.368 \\ 0.368 & 1 \end{vmatrix} > 0$$

Thus F(1) > 0, $(-1)^2 F(-1) > 0$ and $\Delta_1 > 0$, and the system is stable if and only if k < 4.33.

(d) In the absence of the sampler the characteristic equation of the continuous-time
feedback system is 1 + G(s) = 0, which reduces to

$$s^2 + s + k = 0$$

All the roots are in the negative half of the s-plane, and the system is stable, for
all k > 0.
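The numbers in parts (b) and (c) of Example 6.26 can be reproduced with a few lines of Python. This is a sketch only; `closed_loop_poles` and `critical_gain` are hypothetical names. The expression for the critical gain is obtained by solving the condition $(-1)^2 F(-1) > 0$ for k with $e^{-T}$ kept in exact form instead of the rounded values 0.632 and 1.368.

```python
import cmath
import math


def closed_loop_poles(k, T=1.0):
    """Roots of (z - 1)(z - e^{-T}) + k(1 - e^{-T}) z = 0."""
    e = math.exp(-T)
    b = k * (1 - e) - (1 + e)          # coefficient of z
    disc = cmath.sqrt(b * b - 4 * e)   # constant term is e^{-T}
    return (-b + disc) / 2, (-b - disc) / 2


def critical_gain(T=1.0):
    """Gain at which F(-1) = 0, i.e. k = 2(1 + e^{-T}) / (1 - e^{-T})."""
    e = math.exp(-T)
    return 2 * (1 + e) / (1 - e)


print(closed_loop_poles(6.0))   # one pole has modulus greater than 1
print(critical_gain())          # approximately 4.33 for T = 1
```

Evaluating `closed_loop_poles` for gains just below and just above the critical value shows the larger pole modulus crossing 1, which is the behaviour predicted by the Jury conditions.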


Convolution 


Here we shall briefly extend the concept of convolution introduced in Section 5.6.6 to
discrete-time systems. From (6.45), for an initially quiescent system with an impulse
response sequence $\{y_{\delta_k}\}$ with z transform $Y_\delta(z)$, the z transform Y(z) of the output
sequence $\{y_k\}$ in response to an input sequence $\{u_k\}$ with z transform U(z) is given by

$$Y(z) = Y_\delta(z)U(z) \tag{6.49}$$

For the purposes of solving a particular problem, the best approach to determining $\{y_k\}$
for a given $\{u_k\}$ is to invert the right-hand side of (6.49) as an ordinary z transform with
no particular thought as to its structure. However, to understand more of the theory of
linear systems in discrete time, it is worth exploring the general situation a little further.
To do this, we revert to the time domain.

Suppose that a linear discrete-time time-invariant system has impulse response
sequence $\{y_{\delta_k}\}$, and suppose that we wish to find the system response $\{y_k\}$ to an input
sequence $\{u_k\}$, with the system initially in a quiescent state. First we express the
input sequence

$$\{u_k\} = \{u_0, u_1, u_2, \ldots, u_n, \ldots\} \tag{6.55}$$

as

$$\{u_k\} = u_0\{\delta_k\} + u_1\{\delta_{k-1}\} + u_2\{\delta_{k-2}\} + \ldots + u_n\{\delta_{k-n}\} + \ldots \tag{6.56}$$

where

$$\delta_{k-j} = \begin{cases} 0 & (k \ne j) \\ 1 & (k = j) \end{cases}$$

In other words, $\{\delta_{k-j}\}$ is simply an impulse sequence with the pulse shifted to k = j.
Thus, in going from (6.55) to (6.56), we have decomposed the input sequence $\{u_k\}$
into a weighted sum of shifted impulse sequences. Under the assumption of an initially
quiescent system, linearity allows us to express the response $\{y_k\}$ to the input
sequence $\{u_k\}$ as the appropriately weighted sum of shifted impulse responses. Thus,
since the impulse response is $\{y_{\delta_k}\}$, the response to the shifted impulse sequence
$\{\delta_{k-j}\}$ will be $\{y_{\delta_{k-j}}\}$, and the response to the weighted impulse sequence $u_j\{\delta_{k-j}\}$
will be simply $u_j\{y_{\delta_{k-j}}\}$. Summing the contributions from all the sequences in (6.56),
we obtain

$$\{y_k\} = \sum_{j=0}^{\infty} u_j\{y_{\delta_{k-j}}\} \tag{6.57}$$

as the response of the system to the input sequence $\{u_k\}$. Expanding (6.57), we have

$$\begin{aligned}
\{y_k\} &= u_0\{y_{\delta_k}\} + u_1\{y_{\delta_{k-1}}\} + \ldots + u_j\{y_{\delta_{k-j}}\} + \ldots \\
&= u_0\{y_{\delta_0}, y_{\delta_1}, y_{\delta_2}, \ldots, y_{\delta_h}, \ldots\} \\
&\quad + u_1\{0, y_{\delta_0}, y_{\delta_1}, \ldots, y_{\delta_{h-1}}, \ldots\} \\
&\quad + u_2\{0, 0, y_{\delta_0}, \ldots, y_{\delta_{h-2}}, \ldots\} \\
&\quad \ \vdots \\
&\quad + u_h\{0, 0, 0, \ldots, y_{\delta_0}, y_{\delta_1}, \ldots\} \qquad (y_{\delta_0} \text{ in the } h\text{th position}) \\
&\quad + \ldots
\end{aligned}$$

From this expansion, we find that the hth term of the output sequence is determined by

$$y_h = \sum_{j=0}^{h} u_j y_{\delta_{h-j}} \tag{6.58}$$

That is,

$$\{y_k\} = \left\{\sum_{j=0}^{k} u_j y_{\delta_{k-j}}\right\} \tag{6.59}$$

The expression (6.58) is called the convolution sum, and the result (6.59) is analogous
to (5.83) for continuous systems.
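The convolution sum (6.58) translates directly into code. The sketch below is a minimal illustration; the function name `convolution_sum` is hypothetical, and both sequences are assumed to be given as finite lists holding their first few terms, so only as many output terms are produced as both lists can support.

```python
def convolution_sum(u, y_delta):
    """Terms y_k = sum_{j=0}^{k} u[j] * y_delta[k - j] of (6.58).

    u       : leading terms of the input sequence
    y_delta : leading terms of the impulse response sequence
    """
    n_terms = min(len(u), len(y_delta))
    return [
        sum(u[j] * y_delta[k - j] for j in range(k + 1))
        for k in range(n_terms)
    ]


impulse = [1, 0, 0, 0]
response = [0, 1, -3, 7]                       # arbitrary illustrative values
print(convolution_sum(impulse, response))      # reproduces the impulse response
print(convolution_sum([1, 2, 3], [4, 5, 6]))
```

Feeding in the impulse sequence returns the impulse response itself, as it must, and swapping the two arguments leaves the output unchanged.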


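Example 6.27 A system has z transfer function

$$G(z) = \frac{z}{z + \tfrac{1}{2}}$$

What is the system step response? Verify the result using (6.59).

Solution From (6.47), the system step response is

$$Y(z) = G(z)\mathcal{Z}\{h_k\}$$

where $\{h_k\}$ = {1, 1, 1, ...}. From Figure 6.3, $\mathcal{Z}\{h_k\} = z/(z - 1)$, so

$$Y(z) = \frac{z}{z + \tfrac{1}{2}}\,\frac{z}{z - 1}$$

Resolving Y(z)/z into partial fractions gives

$$\frac{Y(z)}{z} = \frac{z}{(z + \tfrac{1}{2})(z - 1)} = \frac{2}{3}\,\frac{1}{z - 1} + \frac{1}{3}\,\frac{1}{z + \tfrac{1}{2}}$$

so

$$Y(z) = \frac{2}{3}\,\frac{z}{z - 1} + \frac{1}{3}\,\frac{z}{z + \tfrac{1}{2}}$$

Taking inverse transforms then gives the step response as

$$\{y_k\} = \left\{\tfrac{2}{3} + \tfrac{1}{3}\left(-\tfrac{1}{2}\right)^k\right\}$$

Using (6.59), we first have to find the impulse response, which, from (6.48), is given by

$$\{y_{\delta_k}\} = \mathcal{Z}^{-1}[G(z)] = \mathcal{Z}^{-1}\left[\frac{z}{z + \tfrac{1}{2}}\right]$$

so that

$$\{y_{\delta_k}\} = \left\{\left(-\tfrac{1}{2}\right)^k\right\}$$

Taking $\{u_k\}$ to be the unit step sequence $\{h_k\}$, where $h_k = 1$ ($k \ge 0$), the step response
may then be determined from (6.59) as

$$\{y_k\} = \left\{\sum_{j=0}^{k} u_j y_{\delta_{k-j}}\right\} = \left\{\sum_{j=0}^{k} 1 \cdot \left(-\tfrac{1}{2}\right)^{k-j}\right\}
= \left\{\left(-\tfrac{1}{2}\right)^k \sum_{j=0}^{k} \left(-\tfrac{1}{2}\right)^{-j}\right\} = \left\{\left(-\tfrac{1}{2}\right)^k \sum_{j=0}^{k} (-2)^j\right\}$$

Recognizing the sum as the sum to k + 1 terms of a geometric series with common ratio
-2, we have

$$\{y_k\} = \left\{\left(-\tfrac{1}{2}\right)^k \frac{1 - (-2)^{k+1}}{1 - (-2)}\right\} = \left\{\tfrac{1}{3}\left(\left(-\tfrac{1}{2}\right)^k + 2\right)\right\} = \left\{\tfrac{2}{3} + \tfrac{1}{3}\left(-\tfrac{1}{2}\right)^k\right\}$$

which concurs with the sequence obtained by direct evaluation.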
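The agreement found in Example 6.27 can also be confirmed term by term. The following sketch (hypothetical function names, exact arithmetic via the standard `fractions` module) evaluates the convolution sum for a unit step input and compares it with the closed form obtained by inverting Y(z).

```python
from fractions import Fraction


def step_response_by_convolution(n_terms):
    """Step response of G(z) = z/(z + 1/2) from the convolution sum (6.59)."""
    y_delta = [Fraction(-1, 2) ** k for k in range(n_terms)]   # impulse response
    # The unit step has u_j = 1, so each output term is a partial sum.
    return [sum(y_delta[k - j] for j in range(k + 1)) for k in range(n_terms)]


def step_response_closed_form(n_terms):
    """Step response 2/3 + (1/3)(-1/2)^k from direct inversion."""
    return [Fraction(2, 3) + Fraction(1, 3) * Fraction(-1, 2) ** k
            for k in range(n_terms)]


print(step_response_by_convolution(5))
print(step_response_closed_form(5))
```

Because rational arithmetic is exact, the two lists can be compared with `==` and no tolerance is needed.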


Example 6.27 reinforces the remark made earlier that the easiest approach to
obtaining the response is by direct inversion of (6.49). However, (6.59), together with
the argument leading to it, provides a great deal of insight into the way in which the
response sequence $\{y_k\}$ is generated. It also serves as a useful ‘closed form’ for the
output of the system, and readers should consult specialist texts on signals and systems
for a full discussion (P. Kraniauskas, Transforms in Signals and Systems, Addison-Wesley,
Wokingham, 1992).

The astute reader will recall that we commenced this section by suggesting that we
were about to study the implications of the input-output relationship (6.49), namely

$$Y(z) = Y_\delta(z)U(z)$$

We have in fact explored the time-domain input-output relationship for a linear system,
and we now proceed to link this approach with our work in the transform domain. By
definition,

$$U(z) = \sum_{k=0}^{\infty} u_k z^{-k} = u_0 + \frac{u_1}{z} + \frac{u_2}{z^2} + \ldots + \frac{u_k}{z^k} + \ldots$$

$$Y_\delta(z) = \sum_{k=0}^{\infty} y_{\delta_k} z^{-k} = y_{\delta_0} + \frac{y_{\delta_1}}{z} + \frac{y_{\delta_2}}{z^2} + \ldots + \frac{y_{\delta_k}}{z^k} + \ldots$$

so

$$Y_\delta(z)U(z) = u_0 y_{\delta_0} + (u_0 y_{\delta_1} + u_1 y_{\delta_0})\frac{1}{z} + (u_0 y_{\delta_2} + u_1 y_{\delta_1} + u_2 y_{\delta_0})\frac{1}{z^2} + \ldots \tag{6.60}$$

Considering the kth term of (6.60), we see that the coefficient of $z^{-k}$ is simply

$$\sum_{j=0}^{k} u_j y_{\delta_{k-j}}$$

However, by definition, since $Y(z) = Y_\delta(z)U(z)$, this is also $y_k$, the kth term of the
output sequence, so that the latter is

$$\{y_k\} = \left\{\sum_{j=0}^{k} u_j y_{\delta_{k-j}}\right\}$$

as found in (6.59). We have thus shown that the time-domain and transform-domain
approaches are equivalent, and, in passing, we have established the z transform of the
convolution sum as

$$\mathcal{Z}\left\{\sum_{j=0}^{k} u_j v_{k-j}\right\} = U(z)V(z) \tag{6.61}$$

where

$$\mathcal{Z}\{u_k\} = U(z), \quad \mathcal{Z}\{v_k\} = V(z)$$

Putting p = k - j in (6.61) shows that

$$\sum_{j=0}^{k} u_j v_{k-j} = \sum_{p=0}^{k} u_{k-p} v_p \tag{6.62}$$

confirming that the convolution process is commutative.

6.6.5 Exercises


Check your answers using MATLAB or MAPLE whenever possible. 


Find the transfer functions of each of the following 27 Following the same procedure as in Example 6.26 

discrete-time systems, given that the system is show that the closed-loop discrete-time system of 

initially in a quiescent state: Figure 6.17, in which k > 0 and 7 — 0, is stable if 
and only if 


(а) уһ — Зу t 2yy 7 uy 





T 
(b) yu = Зум + 2 = ща — Ue 0<k<2 coth(;- ) 


(с) Ves — Vier + 2Уы + Vk = Up + Upi 




















Т 
Draw a block diagram representing the discrete- X k 
time system + sampler "ту +1) т 
Уһ + 0.5уь + 0.25у; = ир Es к>0,т>0 
Hence find a block diagram representation of the 
system 
Vi + OSV t 0.25y, — u, — 0.6u,,, Figure 6.17 Discrete-time system of Exercise 27. 


Find the impulse response for the systems with 


z transfer function ] . 
28 A sampled data system described by the difference 
2 


ü 
@ — 2 b) — 2 equation 

82 +62+1 z-3z+3 

Yai Уһ = и, 
г? 522 - 122 

OS on ot = is controlled by making the input u, proportional to 

a Ne Егете the previous error according to 
Obtain the impulse response for the systems of 1 
Exercises 21(a, b). нЕ E - Ya] 


where K is a positive gain. Determine the range of 


Which of the following systems are stable? . . I 
values of K for which the system is stable. Taking 


(a) 9yu5 * 9yu4 t 2y, 7 u, K = 2, determine the response of the system given 
9 p 
=y,=0. 
(b) 9yu; — 3yua 7 2y 7 uy жол 
(с) 2р – 25 + Yk = Uk — Ue 29 Show that the system 
(d) 2уыз * Sia Е Vi Е Uk Уһ Ez 2ysa + 2y, = иһ (п = 0) 
(е) ы —3уы-—Ж%= ны —2% has transfer function 
Use the method of Example 6.27 to calculate D) = z 
the step response of the system with transfer e" Z 42242 
function 
Show that the poles of the system are at z = —1 +j 
ete and z = —1 —j. Hence show that the impulse 
1 ‚ # 
25 response of the system is given by 


Verify the result by direct calculation. h,= ££" D(z) - 2"? sini nx 
The relationship between Laplace and z transforms


Figure 6.18 Sampled 
function f(t). 


Throughout this chapter we have attempted to highlight similarities, where they occur,
between results in Laplace transform theory and those for z transforms. In this section
we take a closer look at the relationship between the two transforms. In Section 6.2.2
we introduced the idea of sampling a continuous-time signal f(t) instantaneously at
uniform intervals T to produce the sequence

$$\{f(kT)\} = \{f(0), f(T), f(2T), \ldots, f(nT), \ldots\} \tag{6.63}$$

An alternative way of representing the sampled function is to define the continuous-time
sampled version of f(t) as $\hat{f}(t)$ where

$$\hat{f}(t) = \sum_{k=0}^{\infty} f(t)\delta(t - kT) = \sum_{k=0}^{\infty} f(kT)\delta(t - kT) \tag{6.64}$$

The representation (6.64) may be interpreted as defining a row of impulses located at
the sampling points and weighted by the appropriate sampled values (as illustrated in
Figure 6.18). Taking the Laplace transform of $\hat{f}(t)$, following the results of Section 5.5.10,
we have

$$\mathcal{L}\{\hat{f}(t)\} = \int_{0^-}^{\infty} \left[\sum_{k=0}^{\infty} f(kT)\delta(t - kT)\right] e^{-st}\,dt
= \sum_{k=0}^{\infty} f(kT) \int_{0^-}^{\infty} \delta(t - kT)\,e^{-st}\,dt$$

giving

$$\mathcal{L}\{\hat{f}(t)\} = \sum_{k=0}^{\infty} f(kT)\,e^{-ksT} \tag{6.65}$$

Making the change of variable $z = e^{sT}$ in (6.65) leads to the result

$$\mathcal{L}\{\hat{f}(t)\} = \sum_{k=0}^{\infty} f(kT)\,z^{-k} = F(z) \tag{6.66}$$

where, as in (6.10), F(z) denotes the z transform of the sequence $\{f(kT)\}$. We can
therefore view the z transform of a sequence of samples in discrete time as the Laplace
transform of the continuous-time sampled function $\hat{f}(t)$ with an appropriate change of
variable

$$z = e^{sT} \quad \text{or} \quad s = \frac{1}{T}\ln z$$

In Chapter 4 we saw that under this transformation the left half of the s plane, Re(s) < 0,
is mapped onto the region inside the unit circle in the z plane, |z| < 1. This is
consistent with our stability criteria in the s and z domains.
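The effect of the change of variable $z = e^{sT}$ on the stability regions can be seen from a few sample points. The sketch below is illustrative only: `s_to_z` and `z_to_s` are hypothetical names, the sampling period T = 0.1 is an arbitrary choice, and the inverse map uses the principal branch of the logarithm, so it recovers s only when |Im(s)| T < π.

```python
import cmath


def s_to_z(s, T):
    """Map a point of the s plane to the z plane via z = e^{sT}."""
    return cmath.exp(s * T)


def z_to_s(z, T):
    """Inverse map s = (1/T) ln z, taking the principal branch of ln."""
    return cmath.log(z) / T


T = 0.1
for s in (-0.5 + 3j, 2j, 0.5 + 1j):
    z = s_to_z(s, T)
    print(s, abs(z))   # |z| = e^{Re(s) T}: below, equal to, or above 1
```

Since $|z| = e^{\mathrm{Re}(s)T}$, points with Re(s) < 0 land inside the unit circle, points on the imaginary axis land on it, and points with Re(s) > 0 land outside.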


Solution of discrete-time state-space equations 


The state-space approach to the analysis of continuous-time dynamic systems, developed
in Section 5.7, can be extended to the discrete-time case. The discrete form of the
state-space representation is quite analogous to the continuous form.


6.8.1 State-space model


Consider the nth-order linear time-invariant discrete-time system modelled by the
difference equation

$$y_{k+n} + a_{n-1}y_{k+n-1} + a_{n-2}y_{k+n-2} + \ldots + a_0 y_k = b_0 u_k \tag{6.67}$$

which corresponds to (6.32), with $b_i = 0$ (i > 0). Recall that $\{y_k\}$ is the output sequence,
with general term $y_k$, and $\{u_k\}$ the input sequence, with general term $u_k$. Following the
procedure of Section 1.9.1, we introduce state variables $x_1(k), x_2(k), \ldots, x_n(k)$ for the
system, defined by

$$x_1(k) = y_k, \quad x_2(k) = y_{k+1}, \quad \ldots, \quad x_n(k) = y_{k+n-1} \tag{6.68}$$

Note that we have used the notation $x_i(k)$ rather than the suffix notation $x_{i,k}$ for clarity.
When needed, we shall adopt the same convention for the input term and write u(k) for
$u_k$ in the interests of consistency. We now define the state vector corresponding to this
choice of state variables as $\mathbf{x}(k) = [x_1(k) \ \ x_2(k) \ \ \ldots \ \ x_n(k)]^T$. Examining the system
of equations (6.68), we see that

$$\begin{aligned}
x_1(k+1) &= y_{k+1} = x_2(k) \\
x_2(k+1) &= y_{k+2} = x_3(k) \\
&\ \ \vdots \\
x_{n-1}(k+1) &= y_{k+n-1} = x_n(k) \\
x_n(k+1) &= y_{k+n} \\
&= -a_{n-1}y_{k+n-1} - a_{n-2}y_{k+n-2} - \ldots - a_0 y_k + b_0 u_k \\
&= -a_{n-1}x_n(k) - a_{n-2}x_{n-1}(k) - \ldots - a_0 x_1(k) + b_0 u(k)
\end{aligned}$$

using the alternative notation for $u_k$.

We can now write the system in the vector-matrix form

$$\mathbf{x}(k+1) = \begin{bmatrix} x_1(k+1) \\ x_2(k+1) \\ \vdots \\ x_{n-1}(k+1) \\ x_n(k+1) \end{bmatrix}
= \begin{bmatrix} 0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & -a_2 & -a_3 & \cdots & -a_{n-1} \end{bmatrix}
\begin{bmatrix} x_1(k) \\ x_2(k) \\ \vdots \\ x_{n-1}(k) \\ x_n(k) \end{bmatrix}
+ \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ b_0 \end{bmatrix} u(k) \tag{6.69}$$

which corresponds to (1.60) for a continuous-time system. Again, we can write this
more concisely as

$$\mathbf{x}(k+1) = \mathbf{A}\mathbf{x}(k) + \mathbf{b}u(k) \tag{6.70}$$

where A and b are defined as in (6.69). The output of the system is the sequence $\{y_k\}$,
and the general term $y_k = x_1(k)$ can be recovered from the state vector x(k) as

$$y(k) = x_1(k) = [1 \ \ 0 \ \ 0 \ \ \ldots \ \ 0]\,\mathbf{x}(k) = \mathbf{c}^T\mathbf{x}(k) \tag{6.71}$$

As in the continuous-time case, it may be that the output of the system is a combination
of the state and the input sequence {u(k)}, in which case (6.71) becomes

$$y(k) = \mathbf{c}^T\mathbf{x}(k) + d\,u(k) \tag{6.72}$$

Equations (6.70) and (6.72) constitute the state-space representation of the system,
and we immediately note the similarity with (1.63a, b) derived for continuous-time
systems. Likewise, for the multi-input-multi-output case the discrete-time state-space
model corresponding to (1.69a, b) is

$$\mathbf{x}(k+1) = \mathbf{A}\mathbf{x}(k) + \mathbf{B}\mathbf{u}(k) \tag{6.73a}$$

$$\mathbf{y}(k) = \mathbf{C}\mathbf{x}(k) + \mathbf{D}\mathbf{u}(k) \tag{6.73b}$$

Example 6.28 Determine the state-space representation of the system modelled by the difference
equation

$$y_{k+2} + 0.2y_{k+1} + 0.3y_k = u_k \tag{6.74}$$

Solution We choose as state variables

$$x_1(k) = y_k, \quad x_2(k) = y_{k+1}$$

Thus

$$x_1(k+1) = x_2(k)$$

and from (6.74),

$$x_2(k+1) = -0.3x_1(k) - 0.2x_2(k) + u(k)$$

The state-space representation is then

$$\mathbf{x}(k+1) = \mathbf{A}\mathbf{x}(k) + \mathbf{b}u(k), \quad y(k) = \mathbf{c}^T\mathbf{x}(k)$$

with

$$\mathbf{A} = \begin{bmatrix} 0 & 1 \\ -0.3 & -0.2 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad \mathbf{c}^T = [1 \ \ 0]$$

Figure 6.19 Block diagram of system with transfer function G(z) = (z - 1)/(z^2 + 3z + 2).

We notice, from reference to Section 6.6.1, that the procedure used in Example 6.28
for establishing the state-space form of the system corresponds to labelling the output
of each delay block in the system as a state variable. In the absence of any reason for
an alternative choice, this is the logical approach. Section 6.6.1 also gives a clue
towards a method of obtaining the state-space representation for systems described by
the more general form of (6.32) with m > 0. Example 6.19 illustrates such a system,
with z transfer function

$$G(z) = \frac{z - 1}{z^2 + 3z + 2}$$

The block diagram for this system is shown in Figure 6.9(c) and reproduced for
convenience in Figure 6.19. We choose as state variables the outputs from each delay
block, it being immaterial whether we start from the left- or the right-hand side of the
diagram (obviously, different representations will be obtained depending on the choice
we make, but the different forms will yield identical information on the system).
Choosing to start on the right-hand side (that is, with $x_1(k)$ the output of the right-hand
delay block and $x_2(k)$ that of the left-hand block), we obtain

$$x_1(k+1) = x_2(k)$$

$$x_2(k+1) = -3x_2(k) - 2x_1(k) + u(k)$$

with the system output given by

$$y(k) = -x_1(k) + x_2(k)$$

Thus the state-space form corresponding to our choice of state variables is

$$\mathbf{x}(k+1) = \mathbf{A}\mathbf{x}(k) + \mathbf{b}u(k), \quad y(k) = \mathbf{c}^T\mathbf{x}(k)$$

with

$$\mathbf{A} = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad \mathbf{c}^T = [-1 \ \ 1]$$

We notice that, in contrast with the system of Example 6.28, the row vector $\mathbf{c}^T = [-1 \ \ 1]$
now combines contributions from both state variables to form the output y(k).


6.8.2 Solution of the discrete-time state equation


As in Section 1.10.1 for continuous-time systems, we first consider the unforced or
homogeneous case

$$\mathbf{x}(k+1) = \mathbf{A}\mathbf{x}(k) \tag{6.75}$$

in which the input u(k) is zero for all time instants k. Taking k = 0 in (6.75) gives

$$\mathbf{x}(1) = \mathbf{A}\mathbf{x}(0)$$

Likewise, taking k = 1 in (6.75) gives

$$\mathbf{x}(2) = \mathbf{A}\mathbf{x}(1) = \mathbf{A}^2\mathbf{x}(0)$$

and we readily deduce that in general

$$\mathbf{x}(k) = \mathbf{A}^k\mathbf{x}(0) \quad (k \ge 0) \tag{6.76}$$

Equation (6.76) represents the solution of (6.75), and is analogous to (1.80) for the
continuous-time case. We define the transition matrix $\boldsymbol{\Phi}(k)$ of the discrete-time
system (6.75) by

$$\boldsymbol{\Phi}(k) = \mathbf{A}^k$$

and it is the unique matrix satisfying

$$\boldsymbol{\Phi}(k+1) = \mathbf{A}\boldsymbol{\Phi}(k), \quad \boldsymbol{\Phi}(0) = \mathbf{I}$$

where I is the identity matrix.

Since A is a constant matrix, the methods discussed in Section 1.7 are applicable for
evaluating the transition matrix. From (1.34a),

$$\mathbf{A}^k = \alpha_0(k)\mathbf{I} + \alpha_1(k)\mathbf{A} + \alpha_2(k)\mathbf{A}^2 + \ldots + \alpha_{n-1}(k)\mathbf{A}^{n-1} \tag{6.77}$$

where, using (1.34b), the $\alpha_i(k)$ (i = 0, ..., n - 1) are obtained by solving simultaneously
the n equations

$$\lambda_j^k = \alpha_0(k) + \alpha_1(k)\lambda_j + \alpha_2(k)\lambda_j^2 + \ldots + \alpha_{n-1}(k)\lambda_j^{n-1} \tag{6.78}$$

where $\lambda_j$ (j = 1, 2, ..., n) are the eigenvalues of A. As in Section 1.7, if A has repeated
eigenvalues then derivatives of $\lambda^k$ with respect to $\lambda$ will have to be used. The method
for determining $\mathbf{A}^k$ is thus very similar to that used for evaluating $e^{\mathbf{A}t}$ in Section 1.10.3.


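Example 6.29 Obtain the response of the second-order unforced discrete-time system

$$\mathbf{x}(k+1) = \begin{bmatrix} x_1(k+1) \\ x_2(k+1) \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2} & 0 \\ -1 & \tfrac{1}{3} \end{bmatrix}\mathbf{x}(k)$$

subject to $\mathbf{x}(0) = [1 \ \ 1]^T$.

Solution In this case the system matrix is

$$\mathbf{A} = \begin{bmatrix} \tfrac{1}{2} & 0 \\ -1 & \tfrac{1}{3} \end{bmatrix}$$

having eigenvalues $\lambda_1 = \tfrac{1}{2}$ and $\lambda_2 = \tfrac{1}{3}$. Since A is a 2 × 2 matrix, it follows from (6.77)
that

$$\mathbf{A}^k = \alpha_0(k)\mathbf{I} + \alpha_1(k)\mathbf{A}$$

with $\alpha_0(k)$ and $\alpha_1(k)$ given from (6.78),

$$\lambda_j^k = \alpha_0(k) + \alpha_1(k)\lambda_j \quad (j = 1, 2)$$

Solving the resulting two equations

$$\left(\tfrac{1}{2}\right)^k = \alpha_0(k) + \tfrac{1}{2}\alpha_1(k), \quad \left(\tfrac{1}{3}\right)^k = \alpha_0(k) + \tfrac{1}{3}\alpha_1(k)$$

for $\alpha_0(k)$ and $\alpha_1(k)$ gives

$$\alpha_0(k) = 3\left(\tfrac{1}{3}\right)^k - 2\left(\tfrac{1}{2}\right)^k, \quad \alpha_1(k) = 6\left[\left(\tfrac{1}{2}\right)^k - \left(\tfrac{1}{3}\right)^k\right]$$

Thus the transition matrix is

$$\boldsymbol{\Phi}(k) = \mathbf{A}^k = \begin{bmatrix} \left(\tfrac{1}{2}\right)^k & 0 \\ 6\left[\left(\tfrac{1}{3}\right)^k - \left(\tfrac{1}{2}\right)^k\right] & \left(\tfrac{1}{3}\right)^k \end{bmatrix}$$

Note that $\boldsymbol{\Phi}(0) = \mathbf{I}$, as required.

Then from (6.76) the solution of the unforced system is

$$\mathbf{x}(k) = \begin{bmatrix} \left(\tfrac{1}{2}\right)^k & 0 \\ 6\left[\left(\tfrac{1}{3}\right)^k - \left(\tfrac{1}{2}\right)^k\right] & \left(\tfrac{1}{3}\right)^k \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} \left(\tfrac{1}{2}\right)^k \\ 7\left(\tfrac{1}{3}\right)^k - 6\left(\tfrac{1}{2}\right)^k \end{bmatrix}$$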
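The closed-form state found in Example 6.29 can be checked against repeated application of (6.75). The sketch below uses exact fractions; the helper names `mat_vec`, `state_by_iteration` and `state_closed_form` are hypothetical and introduced only for this check.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
THIRD = Fraction(1, 3)
A = [[HALF, Fraction(0)], [Fraction(-1), THIRD]]   # system matrix of Example 6.29


def mat_vec(M, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def state_by_iteration(x0, k):
    """x(k) obtained by applying x(k+1) = A x(k) k times."""
    x = list(x0)
    for _ in range(k):
        x = mat_vec(A, x)
    return x


def state_closed_form(k):
    """x(k) = [(1/2)^k, 7(1/3)^k - 6(1/2)^k]^T for x(0) = [1, 1]^T."""
    return [HALF ** k, 7 * THIRD ** k - 6 * HALF ** k]


for k in range(4):
    print(k, state_by_iteration([1, 1], k), state_closed_form(k))
```

The two columns of output agree exactly, including the value $x_2(1) = -\tfrac{2}{3}$ that follows directly from the state equation.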


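Having determined the solution of the unforced system, it can be shown that the
solution of the state equation (6.73a) for the forced system with input u(k), analogous
to the solution given in (1.81) for the continuous-time system

$$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}$$

is

$$\mathbf{x}(k) = \mathbf{A}^k\mathbf{x}(0) + \sum_{j=0}^{k-1} \mathbf{A}^{k-j-1}\mathbf{B}\mathbf{u}(j) \tag{6.79}$$

Having obtained the solution of the state equation, the system output or response y(k)
is obtained from (6.73b) as

$$\mathbf{y}(k) = \mathbf{C}\mathbf{A}^k\mathbf{x}(0) + \mathbf{C}\sum_{j=0}^{k-1} \mathbf{A}^{k-j-1}\mathbf{B}\mathbf{u}(j) + \mathbf{D}\mathbf{u}(k) \tag{6.80}$$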
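Formula (6.79) can be compared with straightforward recursion of the state equation. The following is a minimal single-input sketch under stated assumptions: the function names are hypothetical, the matrices A and b are those obtained earlier for the system with G(z) = (z - 1)/(z^2 + 3z + 2), and the initial state and unit step input are arbitrary illustrative choices.

```python
def mat_vec(M, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def mat_pow_vec(M, v, p):
    """Return M^p v by applying M to v p times."""
    for _ in range(p):
        v = mat_vec(M, v)
    return v


def state_by_formula(A, b, x0, u, k):
    """x(k) = A^k x(0) + sum_{j=0}^{k-1} A^{k-j-1} b u(j), as in (6.79)."""
    x = mat_pow_vec(A, list(x0), k)
    for j in range(k):
        term = mat_pow_vec(A, [bi * u[j] for bi in b], k - j - 1)
        x = [xi + ti for xi, ti in zip(x, term)]
    return x


def state_by_recursion(A, b, x0, u, k):
    """x(k) from k steps of x(j+1) = A x(j) + b u(j)."""
    x = list(x0)
    for j in range(k):
        x = [axi + bi * u[j] for axi, bi in zip(mat_vec(A, x), b)]
    return x


A = [[0, 1], [-2, -3]]
b = [0, 1]
u = [1] * 8            # unit step input (illustrative)
x0 = [1, 0]            # illustrative initial state
for k in range(5):
    print(k, state_by_formula(A, b, x0, u, k), state_by_recursion(A, b, x0, u, k))
```

With integer data the comparison is exact, so any discrepancy would indicate an indexing error in the sum.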


In Section 5.7.1 we saw how the Laplace transform could be used to solve the state- 
space equations in the case of continuous-time systems. In a similar manner, z trans- 
forms can be used to solve the equations for discrete-time systems. 


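Defining $\mathcal{Z}\{\mathbf{x}(k)\} = \mathbf{X}(z)$ and $\mathcal{Z}\{\mathbf{u}(k)\} = \mathbf{U}(z)$ and taking z transforms throughout in
the equation

$$\mathbf{x}(k+1) = \mathbf{A}\mathbf{x}(k) + \mathbf{B}\mathbf{u}(k)$$

gives

$$z\mathbf{X}(z) - z\mathbf{x}(0) = \mathbf{A}\mathbf{X}(z) + \mathbf{B}\mathbf{U}(z)$$

which, on rearranging, gives

$$(z\mathbf{I} - \mathbf{A})\mathbf{X}(z) = z\mathbf{x}(0) + \mathbf{B}\mathbf{U}(z)$$

where I is the identity matrix. Premultiplying by $(z\mathbf{I} - \mathbf{A})^{-1}$ gives

$$\mathbf{X}(z) = z(z\mathbf{I} - \mathbf{A})^{-1}\mathbf{x}(0) + (z\mathbf{I} - \mathbf{A})^{-1}\mathbf{B}\mathbf{U}(z) \tag{6.81}$$

Taking inverse z transforms gives the response as

$$\mathbf{x}(k) = \mathcal{Z}^{-1}\{\mathbf{X}(z)\} = \mathcal{Z}^{-1}\{z(z\mathbf{I} - \mathbf{A})^{-1}\}\mathbf{x}(0) + \mathcal{Z}^{-1}\{(z\mathbf{I} - \mathbf{A})^{-1}\mathbf{B}\mathbf{U}(z)\} \tag{6.82}$$

which corresponds to (5.89) in the continuous-time case.

On comparing the solution (6.82) with that given in (6.79), we see that the transition
matrix $\boldsymbol{\Phi}(k) = \mathbf{A}^k$ may also be written in the form

$$\boldsymbol{\Phi}(k) = \mathbf{A}^k = \mathcal{Z}^{-1}\{z(z\mathbf{I} - \mathbf{A})^{-1}\}$$

This is readily confirmed from (6.81), since on expanding $z(z\mathbf{I} - \mathbf{A})^{-1}$ by the binomial
theorem, we have

$$z(z\mathbf{I} - \mathbf{A})^{-1} = \mathbf{I} + \frac{\mathbf{A}}{z} + \frac{\mathbf{A}^2}{z^2} + \ldots + \frac{\mathbf{A}^r}{z^r} + \ldots
= \sum_{r=0}^{\infty} \frac{\mathbf{A}^r}{z^r} = \mathcal{Z}\{\mathbf{A}^k\}$$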
N 


Using the z-transform approach, obtain an expression for the state x(k) of the system 
characterized by the state equation 


х(к+ 1) = А. х(Ё) + : u(k) (kz90) 
-3 -6 1 
when the input is the unit step function 


«o (k « 0) 
1 (k=0) 


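and subject to the initial condition $x(0) = [1 \;\; -1]^{T}$.

Solution  In this case

\[ zI - A = \begin{bmatrix} z-2 & -5 \\ 3 & z+6 \end{bmatrix} \]

giving

\[ (zI - A)^{-1} = \frac{1}{(z+1)(z+3)} \begin{bmatrix} z+6 & 5 \\ -3 & z-2 \end{bmatrix}
 = \begin{bmatrix} \dfrac{5/2}{z+1} - \dfrac{3/2}{z+3} & \dfrac{5/2}{z+1} - \dfrac{5/2}{z+3} \\[2ex]
 -\dfrac{3/2}{z+1} + \dfrac{3/2}{z+3} & -\dfrac{3/2}{z+1} + \dfrac{5/2}{z+3} \end{bmatrix} \]

Then

\[ \mathcal{Z}^{-1}\{z(zI - A)^{-1}\}
 = \mathcal{Z}^{-1}\begin{bmatrix} \dfrac{5}{2}\dfrac{z}{z+1} - \dfrac{3}{2}\dfrac{z}{z+3} & \dfrac{5}{2}\dfrac{z}{z+1} - \dfrac{5}{2}\dfrac{z}{z+3} \\[2ex]
 -\dfrac{3}{2}\dfrac{z}{z+1} + \dfrac{3}{2}\dfrac{z}{z+3} & -\dfrac{3}{2}\dfrac{z}{z+1} + \dfrac{5}{2}\dfrac{z}{z+3} \end{bmatrix}
 = \begin{bmatrix} \tfrac{5}{2}(-1)^{k} - \tfrac{3}{2}(-3)^{k} & \tfrac{5}{2}(-1)^{k} - \tfrac{5}{2}(-3)^{k} \\[1ex]
 -\tfrac{3}{2}(-1)^{k} + \tfrac{3}{2}(-3)^{k} & -\tfrac{3}{2}(-1)^{k} + \tfrac{5}{2}(-3)^{k} \end{bmatrix} \]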


so that, with $x(0) = [1 \;\; -1]^{T}$, the first term in the solution (6.82) becomes

\[ \mathcal{Z}^{-1}\{z(zI - A)^{-1}\}x(0) = \begin{bmatrix} (-3)^{k} \\ -(-3)^{k} \end{bmatrix} \tag{6.83} \]


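Since $U(z) = \mathcal{Z}\{u(k)\} = z/(z - 1)$,

\[ (zI - A)^{-1}BU(z) = \frac{z}{(z-1)(z+1)(z+3)} \begin{bmatrix} z+6 & 5 \\ -3 & z-2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
 = \frac{z}{(z-1)(z+1)(z+3)} \begin{bmatrix} z+11 \\ z-5 \end{bmatrix} \]

\[ = \begin{bmatrix} \dfrac{3}{2}\dfrac{z}{z-1} - \dfrac{5}{2}\dfrac{z}{z+1} + \dfrac{z}{z+3} \\[2ex]
 -\dfrac{1}{2}\dfrac{z}{z-1} + \dfrac{3}{2}\dfrac{z}{z+1} - \dfrac{z}{z+3} \end{bmatrix} \]

so that the second term in the solution (6.82) becomes

\[ \mathcal{Z}^{-1}\{(zI - A)^{-1}BU(z)\} = \begin{bmatrix} \tfrac{3}{2} - \tfrac{5}{2}(-1)^{k} + (-3)^{k} \\[1ex] -\tfrac{1}{2} + \tfrac{3}{2}(-1)^{k} - (-3)^{k} \end{bmatrix} \tag{6.84} \]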


Combining (6.83) and (6.84), the response x(k) is given by

\[ x(k) = \begin{bmatrix} \tfrac{3}{2} - \tfrac{5}{2}(-1)^{k} + 2(-3)^{k} \\[1ex] -\tfrac{1}{2} + \tfrac{3}{2}(-1)^{k} - 2(-3)^{k} \end{bmatrix} \]

Having obtained an expression for a system's state x(k), its output, or response, y(k) may
be obtained from the linear transformation (6.73b).
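
A closed-form state of this kind can be checked by iterating the state equation directly. The following Python sketch does this under the assumption that the system is the one used in Example 6.30, that is A = [2 5; -3 -6], B = [1 1]^T, unit step input and x(0) = [1 -1]^T. The function names are illustrative choices and are not taken from the text.

```python
# Assumed data of Example 6.30: x(k+1) = A x(k) + B u(k), u(k) = 1 for k >= 0.
A = [[2, 5], [-3, -6]]
B = [1, 1]


def step_once(x, u):
    """Advance the two-state system by one sample."""
    return [A[0][0] * x[0] + A[0][1] * x[1] + B[0] * u,
            A[1][0] * x[0] + A[1][1] * x[1] + B[1] * u]


def x_recursive(k, x0=(1, -1)):
    """State x(k) obtained by applying the state equation k times."""
    x = list(x0)
    for _ in range(k):
        x = step_once(x, 1)  # unit step input
    return x


def x_closed_form(k):
    """State x(k) from the z-transform solution (sum of the two inverse transforms)."""
    return [1.5 - 2.5 * (-1) ** k + 2 * (-3) ** k,
            -0.5 + 1.5 * (-1) ** k - 2 * (-3) ** k]


if __name__ == "__main__":
    for k in range(5):
        print(k, x_recursive(k), x_closed_form(k))
```

Agreement of the two columns for the first few values of k is a quick guard against sign slips in the partial fractions.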


6.8.3 Exercises 


Check your answers using MATLAB or MAPLE whenever possible. 


30  Use z transforms to determine $A^{k}$ for the matrices


(a) | ] (b) Ё ! (e) | | 
4 0 3 -1 0 -1 


31  Solve the discrete-time system specified by

x(k + 1) = −7x(k) + 4y(k)
y(k + 1) = −8x(k) + y(k)

with x(0) = 1 and y(0) = 2, by writing it in the form
x(k + 1) = Ax(k). Use your answer to calculate x(1)
and x(2), and check your answers by calculating
x(1), y(1), x(2), y(2) directly from the given
difference equations.


32  Using the z-transform approach, obtain an
expression for the state x(k) of the system
characterized by the state equation


x(k+1)= | 0 | + Hl u(k) 
-0.16 —1 1 


when the input is the unit step function

\[ u(k) = \begin{cases} 0 & (k < 0) \\ 1 & (k \ge 0) \end{cases} \]

and subject to the initial condition $x(0) = [1 \;\; -1]^{T}$.


33  The difference equation

\[ y(k+2) = y(k+1) + y(k) \]

with y(0) = 0, and y(1) = 1, generates the Fibonacci
sequence {y(k)}, which occurs in many practical
situations. Taking $x_1(k) = y(k)$ and $x_2(k) = y(k+1)$,
express the difference equation in state-space form
and hence obtain a general expression for y(k).
Show that as k → ∞ the ratio y(k + 1)/y(k) tends
to the constant $\tfrac{1}{2}(\sqrt{5} + 1)$. This is the so-called
Golden Ratio, which has intrigued mathematicians
for centuries because of its strong influence on art
and architecture. The Golden Rectangle, that is one
whose two sides are in this ratio, is one of the most
visually satisfying of all geometric forms.


6.9  Discretization of continuous-time state-space models


In Sections 1.10 and 5.7 we considered the solutions of the continuous-time state-space
model

\[ \dot{x}(t) = Ax(t) + Bu(t) \tag{6.85a} \]
\[ y(t) = Cx(t) \tag{6.85b} \]

If we wish to compute the state x(t) digitally then we must first approximate the continuous
model by a discrete-time state-space model of the form

\[ x[(k+1)T] = Gx(kT) + Hu(kT) \tag{6.86a} \]
\[ y(kT) = Cx(kT) \tag{6.86b} \]

Thus we are interested in determining matrices G and H such that the responses to the
discrete-time model (6.86) provide a good approximation to sampled values of the
continuous-time model (6.85). We assume that sampling occurs at equally spaced
sampling instants t = kT, where T > 0 is the sampling interval. For clarification we
use the notation x(kT) and x[(k + 1)T] instead of x(k) and x(k + 1) as in (6.73).


6.9.1  Euler's method


A simple but crude method of determining G and H is based on Euler's method considered
in Section 10.6 of Modern Engineering Mathematics. Here the derivative of the
state is approximated by

\[ \dot{x}(t) \cong \frac{x(t+T) - x(t)}{T} \]

which on substituting in (6.85a) gives

\[ \frac{x(t+T) - x(t)}{T} \cong Ax(t) + Bu(t) \]

which reduces to

\[ x(t+T) \cong (TA + I)x(t) + TBu(t) \tag{6.87} \]

Since t is divided into equally spaced sampling intervals of duration T we take t = kT,
where k is the integer index k = 0, 1, 2, ..., so that (6.87) becomes

\[ x[(k+1)T] \cong (TA + I)x(kT) + TBu(kT) \tag{6.88} \]

Defining

\[ G = G_1 = (TA + I) \quad \text{and} \quad H = H_1 = TB \tag{6.89} \]

(6.86) then becomes the approximating discrete-time model to the continuous-time
model (6.85). This approach to discretization is known as Euler's method and simply
involves a sequential series of calculations.
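
The sequential calculation implied by (6.88) can be sketched in a few lines of Python. The sketch assumes, purely for illustration, the second-order system ÿ + 3ẏ + 2y = 2u written with state x = [y ẏ]^T (so A = [0 1; -2 -3], B = [0 2]^T, C = [1 0]), a unit step input and zero initial conditions; the exact response 1 − 2e^{−t} + e^{−2t} used for comparison follows from solving that differential equation directly. The function names are hypothetical.

```python
import math


def euler_step_response(T, n_steps):
    """Samples y(kT), k = 0..n_steps, of the Euler model with a unit step input."""
    # G1 = T*A + I and H1 = T*B for the assumed system A = [0 1; -2 -3], B = [0; 2].
    G1 = [[1.0, T], [-2.0 * T, 1.0 - 3.0 * T]]
    H1 = [0.0, 2.0 * T]
    x1, x2 = 0.0, 0.0          # zero initial conditions
    ys = []
    for _ in range(n_steps + 1):
        ys.append(x1)          # y = [1 0] x
        x1, x2 = (G1[0][0] * x1 + G1[0][1] * x2 + H1[0],
                  G1[1][0] * x1 + G1[1][1] * x2 + H1[1])
    return ys


def exact_step_response(t):
    """Exact unit step response of y'' + 3y' + 2y = 2u with zero initial conditions."""
    return 1.0 - 2.0 * math.exp(-t) + math.exp(-2.0 * t)


if __name__ == "__main__":
    T = 0.2
    for k, yk in enumerate(euler_step_response(T, 10)):
        print(f"t = {k * T:.1f}  Euler = {yk:.4f}  exact = {exact_step_response(k * T):.4f}")
```

Reducing T in this sketch shrinks the gap between the two columns, which is the sense in which the method is simple but crude.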
Example 6.31  Consider the system modelled by the second-order differential equation

\[ \ddot{y}(t) + 3\dot{y}(t) + 2y = 2u(t) \]

(a) Choosing the state-vector $x = [y \;\; \dot{y}]^{T}$ express this in a state-space form.

(b) Using Euler's method, determine the approximating discrete-time state-space
model.

(c) Illustrate by plotting the responses y(t), for both the exact continuous response
and the discretized responses, for a step input u(t) = 1 and zero initial conditions,
taking T = 0.2.


Solution  (a) Since $x_1 = y$, $x_2 = \dot{y}$ we have that

\[ \dot{x}_1 = \dot{y} = x_2 \]
\[ \dot{x}_2 = \ddot{y} = -2x_1 - 3x_2 + 2u \]

so the state-space model is

\[ \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 0 \\ 2 \end{bmatrix} u(t), \qquad y = [1 \;\; 0] \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]

(b) From (6.89)

\[ G_1 = TA + I = \begin{bmatrix} 1 & T \\ -2T & -3T+1 \end{bmatrix}, \qquad H_1 = TB = \begin{bmatrix} 0 \\ 2T \end{bmatrix} \]

so the discretized state-space model is

\[ \begin{bmatrix} x_1[(k+1)T] \\ x_2[(k+1)T] \end{bmatrix} = \begin{bmatrix} 1 & T \\ -2T & -3T+1 \end{bmatrix} \begin{bmatrix} x_1(kT) \\ x_2(kT) \end{bmatrix} + \begin{bmatrix} 0 \\ 2T \end{bmatrix} u(kT), \qquad y(kT) = [1 \;\; 0] \begin{bmatrix} x_1(kT) \\ x_2(kT) \end{bmatrix} \]

(c) Using the MATLAB commands:

A = [0,1;-2,-3]; B = [0;2]; C = [1,0];
k = 0;
for T = 0.2
k = k + 1;
G1 = [1,T;-2*T,-3*T+1]; H1 = [0;2*T];
t = T*[0:30];
y = step(A,B,C,0,1,t); yd = dstep(G1,H1,C,0,1,31);
plot(t,y,t,yd,'x')
end

step responses for both the continuous model and the Euler discretized model are
displayed in Figure 6.20 with 'x' denoting the discretized response.


Figure 6.20  Discretization using Euler's method. (Plot of step response.)


6.9.2  Step-invariant method


To determine the matrices G and H in the discrete-time model (6.86), use is made of
the explicit solution to the state equation (6.85a). From (1.81) the solution of (6.85a) is
given by

\[ x(t) = e^{A(t-t_0)}x(t_0) + \int_{t_0}^{t} e^{A(t-\tau_1)}Bu(\tau_1)\,d\tau_1 \tag{6.90} \]

Taking $t_0 = kT$ and $t = (k+1)T$ in (6.90) gives

\[ x[(k+1)T] = e^{AT}x(kT) + \int_{kT}^{(k+1)T} e^{A[(k+1)T-\tau_1]}Bu(\tau_1)\,d\tau_1 \]

Making the substitution $\tau = \tau_1 - kT$ in the integral gives

\[ x[(k+1)T] = e^{AT}x(kT) + \int_{0}^{T} e^{A(T-\tau)}Bu(kT+\tau)\,d\tau \tag{6.91} \]

The problem now is: How do we approximate the integral in (6.91)? The simplest
approach is to assume that all components of u(t) are constant over intervals between
two consecutive sampling instants so

\[ u(kT+\tau) = u(kT), \quad 0 \le \tau < T, \quad k = 0, 1, 2, \dots \]

The integral in (6.91) then becomes

\[ \left[ \int_{0}^{T} e^{A(T-\tau)}B\,d\tau \right] u(kT) \]

Defining

\[ G = e^{AT} \tag{6.92a} \]

and

\[ H = \int_{0}^{T} e^{A(T-\tau)}B\,d\tau = \int_{0}^{T} e^{At}B\,dt, \quad \text{using substitution } t = (T - \tau) \tag{6.92b} \]

then (6.91) becomes the discretized state equation

\[ x[(k+1)T] = Gx(kT) + Hu(kT) \tag{6.93} \]

The discretized form (6.93) is frequently referred to as the step-invariant method.


Comments

1. From Section 5.7.1 we can determine G using the result

\[ e^{At} = \mathcal{L}^{-1}\{(sI - A)^{-1}\} \tag{6.94} \]

2. If the state matrix A is invertible then from (1.37)

\[ H = \int_{0}^{T} e^{At}B\,dt = A^{-1}(G - I)B = (G - I)A^{-1}B \tag{6.95} \]

3. Using the power series expansion of $e^{At}$ given in (1.27) we can express G and H
as the power series

\[ G = I + TA + \frac{T^{2}A^{2}}{2!} + \dots = \sum_{r=0}^{\infty} \frac{T^{r}A^{r}}{r!} \tag{6.96} \]

\[ H = \left( TI + \frac{T^{2}A}{2!} + \frac{T^{3}A^{2}}{3!} + \dots \right) B = \sum_{r=1}^{\infty} \frac{T^{r}A^{r-1}}{r!}B \tag{6.97} \]

We can approximate G and H by neglecting higher-order terms in T. In the particular
case when we neglect terms of order two or higher in T, results (6.96) and (6.97) give

\[ G = I + TA \quad \text{and} \quad H = TB \]


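which corresponds to Euler's discretization.


Example 6.32  Using the step-invariant method, obtain the discretized form of the state equation for
the continuous-time system

\[ \dot{x} = \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 0 \\ 2 \end{bmatrix} u(t) \]

considered in Example 6.31. Plot the response y(kT) = [1  0]x(kT), for a step input
u(t) = 1 and zero initial conditions, taking T = 0.2.


Solution  Using (6.92), $G = e^{AT}$ and $H = \int_{0}^{T} e^{At}B\,dt$. From (6.94), G follows from
$\mathcal{L}^{-1}\{(sI - A)^{-1}\}$ evaluated at t = T, where

\[ sI - A = \begin{bmatrix} s & -1 \\ 2 & s+3 \end{bmatrix}, \qquad |sI - A| = (s+2)(s+1) \]

\[ (sI - A)^{-1} = \frac{1}{(s+1)(s+2)} \begin{bmatrix} s+3 & 1 \\ -2 & s \end{bmatrix}
 = \begin{bmatrix} -\dfrac{1}{s+2} + \dfrac{2}{s+1} & -\dfrac{1}{s+2} + \dfrac{1}{s+1} \\[2ex]
 \dfrac{2}{s+2} - \dfrac{2}{s+1} & \dfrac{2}{s+2} - \dfrac{1}{s+1} \end{bmatrix} \]

so that

\[ G = e^{AT} = \begin{bmatrix} -e^{-2T} + 2e^{-T} & -e^{-2T} + e^{-T} \\ 2e^{-2T} - 2e^{-T} & 2e^{-2T} - e^{-T} \end{bmatrix} \]

and

\[ H = \int_{0}^{T} e^{At}B\,dt = \left[ \begin{matrix} e^{-2t} - 2e^{-t} \\ -2e^{-2t} + 2e^{-t} \end{matrix} \right]_{0}^{T}
 = \begin{bmatrix} e^{-2T} - 2e^{-T} + 1 \\ -2e^{-2T} + 2e^{-T} \end{bmatrix} \]

Thus, the discrete form of the state equation is

\[ x[(k+1)T] = \begin{bmatrix} -e^{-2T} + 2e^{-T} & -e^{-2T} + e^{-T} \\ 2e^{-2T} - 2e^{-T} & 2e^{-2T} - e^{-T} \end{bmatrix} x(kT)
 + \begin{bmatrix} e^{-2T} - 2e^{-T} + 1 \\ -2e^{-2T} + 2e^{-T} \end{bmatrix} u(kT) \]

In the particular case T = 0.2 the state equation is

\[ x[(k+1)0.2] = \begin{bmatrix} 0.9671 & 0.1484 \\ -0.2968 & 0.5219 \end{bmatrix} x(0.2k) + \begin{bmatrix} 0.0329 \\ 0.2968 \end{bmatrix} u(0.2k) \]

Using MATLAB, step responses for both the continuous-time model and the discretized
step-invariant model are displayed in Figure 6.21, with 'x' denoting the discretized
response.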


Figure 6.21  Discretization using the step-invariant method. (Plot of step response.)


For a given value of T the matrices G and H may be determined by the step-invariant
method using the MATLAB function c2d (continuous to discrete). Thus, for the
system of Example 6.32 with T = 0.2, the commands

A = [0,1;-2,-3];
B = [0;2];
[G,H] = c2d(A,B,0.2)

return

G =  0.9671  0.1484
    -0.2968  0.5219
H =  0.0329
     0.2968

which checks with the answers given in Example 6.32.
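
Where MATLAB is not to hand, the power series for G and H in the step-invariant method can be summed directly. The Python sketch below does this using only the standard library; the function name `step_invariant`, the truncation at 25 terms and the single-input form of B (a plain list) are assumptions made for the illustration. It is applied to the system A = [0 1; -2 -3], B = [0 2]^T with T = 0.2.

```python
def step_invariant(A, B, T, terms=25):
    """Truncated series G = sum_{r>=0} T^r A^r / r!, H = (sum_{r>=1} T^r A^(r-1) / r!) B."""
    n = len(A)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in identity]    # holds T^(r-1) A^(r-1) / (r-1)!
    G = [row[:] for row in identity]
    S = [[0.0] * n for _ in range(n)]      # accumulates sum_{r>=1} T^r A^(r-1) / r!
    for r in range(1, terms + 1):
        for i in range(n):
            for j in range(n):
                S[i][j] += term[i][j] * T / r
        # Next series term: multiply by A and by T / r.
        term = [[sum(term[i][m] * A[m][j] for m in range(n)) * T / r
                 for j in range(n)] for i in range(n)]
        for i in range(n):
            for j in range(n):
                G[i][j] += term[i][j]
    H = [sum(S[i][j] * B[j] for j in range(n)) for i in range(n)]
    return G, H


if __name__ == "__main__":
    G, H = step_invariant([[0.0, 1.0], [-2.0, -3.0]], [0.0, 2.0], 0.2)
    print([[round(g, 4) for g in row] for row in G])
    print([round(h, 4) for h in H])
```

Keeping only the first term of each series in this sketch reproduces the Euler matrices, which makes the relationship between the two methods easy to experiment with.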


6.9.3 Exercises 


34  Using the step-invariant method obtain the discretized
form of the continuous-time state-equation


ea PaO Ms 9 
#, 0 -2|»x, [1 
Check your answer using MATLAB for the
particular case when the sampling period is T = 1.


35  An LCR circuit, with L = C = R = 1, may be
modelled by the continuous-time state-space model


y = [1  0]x


(a) Determine the Euler form of the discretized
state-space model.

(b) Determine the discretized state-space model
using the step-invariant method.
(Hint: Use (6.95) to determine the H matrix.)

(c) Using MATLAB plot, for each of the three
models, responses to a unit step input u(t) = 1
with zero initial conditions, taking the
sampling period T = 0.1.


36  A linear continuous-time system is characterized by
the state matrix

\[ A = \begin{bmatrix} -1 & 1 \\ -1 & -2 \end{bmatrix} \]

(a) Show that the system is stable.

(b) Show that the state matrix of the corresponding
Euler discrete-time system is

\[ \begin{bmatrix} 1-T & T \\ -T & 1-2T \end{bmatrix} \]

(c) Show that stability of the discretized system
requires T < 1.


37 A simple continuous-time model of a production 
and inventory control system may be represented by 
the state-space model 


: XQ) 
(0) = 

-] 0|») 4% O| | u(t) 
1 0||х;(%) 0 -1||н)(%) 


where $x_1(t)$ represents the actual production rate and
$x_2(t)$ represents the current inventory level; $u_1(t)$
represents the scheduled production rate, $u_2(t)$
represents the sales rate and $k_1$ is a constant
gain factor.


(a) Determine, using the step-invariant method,
the discretized form of the model. Express the
model in the particular case when the sampling
period T = 1.


(b) Suppose the production schedule is determined
by the feedback policy

\[ u_1(kT) = k_c - x_2(kT) \]

where $k_c$ is the desired inventory level. The
system is originally in equilibrium with $x_1(0)$
equal to the sales rate and $x_2(0) = k_c$. At time
t = 0 the sales rate suddenly increases by
10%; that is, $u_2(t) = 1.1x_1(0)$ for t ≥ 0. Find
the resulting discrete-time state model, with
sampling rate Т = 1 and taking k, = Ё. 


(c) Find the response of the given continuous-time
model, subject to the same feedback control
policy

\[ u_1(t) = k_c - x_2(t) \]

and the same initial conditions.


The exercise may be extended to include simulation 
studies using MATLAB. 


(This exercise is adapted from an illustrative
problem in William L. Brogan, Modern Control
Theory, 2nd edition, Prentice-Hall, 1985.)


6.10  Engineering application: design of discrete-time systems


An important development in many areas of modern engineering is the replacement 
of analogue devices by digital ones. Perhaps the most widely known example is the 
compact disc player, in which mechanical transcription followed by analogue signal 
processing has been superseded by optical technology and digital signal processing. 
Also, as stated in the introduction, DVD players and digital radios are setting new 
standards in home entertainment. There are other examples in many fields of engineering, 
particularly where automatic control is employed.


6.10.1  Analogue filters


Figure 6.22  Amplitude response for an ideal low-pass filter.

Figure 6.23  LCR network for implementing a second-order Butterworth filter.


At the centre of most signal processing applications are filters. These have the effect 
of changing the spectrum of input signals; that is, attenuating components of signals 
by an amount depending on the frequency of the component. For example, an analogue 
ideal low-pass filter passes without attenuation all signal components at frequencies 
less than a critical frequency ω = ω_c, say. The amplitude of the frequency response
|G(jω)| (see Section 5.8) of such an ideal filter is shown in Figure 6.22.

One class of analogue filters whose frequency response approximates that of the 
ideal low-pass filter comprises those known as Butterworth filters. As well as having 
‘good’ characteristics, these can be implemented using a network as illustrated in 
Figure 6.23 for the second-order filter. 

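It can be shown (see M. J. Chapman, D. P. Goodall and N. C. Steele, Signal Processing
in Electronic Communication, Horwood Publishing, Chichester, 1997) that the transfer
function $G_n(s)$ of the nth-order filter is

\[ G_n(s) = \frac{1}{B_n(x)} \quad \text{where} \quad B_n(x) = \sum_{k=0}^{n} a_k x^{k} \]

with

\[ x = \frac{s}{\omega_c}, \qquad a_k = \prod_{r=1}^{k} \frac{\cos(r-1)\alpha}{\sin r\alpha}, \qquad \alpha = \frac{\pi}{2n} \]

Using these relations, it is readily shown that

\[ G_2(s) = \frac{\omega_c^{2}}{s^{2} + \sqrt{2}\,\omega_c s + \omega_c^{2}} \tag{6.98} \]

\[ G_3(s) = \frac{\omega_c^{3}}{s^{3} + 2\omega_c s^{2} + 2\omega_c^{2}s + \omega_c^{3}} \tag{6.99} \]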

and so on. On sketching the amplitudes of the frequency responses $G_n(j\omega)$, it becomes
apparent that increasing n improves the approximation to the response of the ideal
low-pass filter of Figure 6.22.
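
The coefficient formula for the Butterworth polynomial lends itself to a short numerical check. The Python sketch below evaluates a_k from the product formula, taking a_0 = 1 as the value of the empty product (an assumption consistent with (6.98) and (6.99)), and evaluates the amplitude |G_n(jω)|. The function names are illustrative only.

```python
import math


def butterworth_coefficients(n):
    """Coefficients a_0..a_n of B_n(x), with a_k = prod_{r=1}^{k} cos((r-1)alpha)/sin(r alpha)."""
    alpha = math.pi / (2 * n)
    coeffs = [1.0]                       # a_0 = 1 (empty product, assumed)
    for k in range(1, n + 1):
        coeffs.append(coeffs[-1] * math.cos((k - 1) * alpha) / math.sin(k * alpha))
    return coeffs


def amplitude(n, w, wc=1.0):
    """|G_n(jw)| = 1 / |B_n(jw/wc)| for the nth-order filter."""
    x = complex(0.0, w / wc)
    b = sum(a * x ** k for k, a in enumerate(butterworth_coefficients(n)))
    return 1.0 / abs(b)


if __name__ == "__main__":
    for n in (1, 2, 3, 6):
        print(n, [round(a, 4) for a in butterworth_coefficients(n)],
              [round(amplitude(n, w), 4) for w in (0.5, 1.0, 2.0)])
```

Printing the amplitudes for increasing n shows the response flattening below the cut-off frequency and falling more sharply above it.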
6.10.2  Designing a digital replacement filter


Suppose that we now wish to design a discrete-time system, to operate on samples 
taken from an input signal, that will operate in a similar manner to a Butterworth filter. 
We shall assume that the input signal u(t) and the output signal y(t) of the analogue filter 
are both sampled at the same intervals T to generate the input sequence {u(kT)} and 
the output sequence {y(kT)} respectively. Clearly, we need to specify what is meant 
by ‘operate in a similar manner’. In this case, we shall select as our design strategy a 
method that matches the impulse response sequence of the digital design with a 
sequence of samples, drawn at the appropriate instants T from the impulse response of 
an analogue ‘prototype’. We shall select the prototype from one of the Butterworth 
filters discussed in Section 6.10.1, although there are many other possibilities. 

Let us select the first-order filter, with cut-off frequency ω_c, as our prototype. Then
the first step is to calculate the impulse response of this filter. The Laplace transfer
function of the filter is

\[ G(s) = \frac{\omega_c}{s + \omega_c} \]

So, from (5.71), the impulse response is readily obtained as

\[ h(t) = \omega_c e^{-\omega_c t} \quad (t \ge 0) \tag{6.100} \]

Next, we sample this response at intervals T to generate the sequence

\[ \{h(kT)\} = \{\omega_c e^{-\omega_c kT}\} \]

which on taking the z transform, gives

\[ \mathcal{Z}\{h(kT)\} = H(z) = \omega_c \frac{z}{z - e^{-\omega_c T}} \]


Finally, we choose H(z) to be the transfer function of our digital system. This means 
simply that the input-output relationship for the design of the digital system will be 


Y(z) = H(z)U(z) 


where Y(z) and U(z) are the z transforms of the output and input sequences {y(kT)}
and {u(kT)} respectively. Thus we have

\[ Y(z) = \omega_c \frac{z}{z - e^{-\omega_c T}}\,U(z) \tag{6.101} \]

Our digital system is now defined, and we can easily construct the corresponding
difference equation model of the system as

\[ (z - e^{-\omega_c T})Y(z) = \omega_c zU(z) \]

that is

\[ zY(z) - e^{-\omega_c T}Y(z) = \omega_c zU(z) \]

Under the assumption of zero initial conditions, we can take inverse transforms to obtain
the first-order difference equation model

\[ y(k+1) - e^{-\omega_c T}y(k) = \omega_c u(k+1) \tag{6.102} \]

A block diagram implementation of (6.102) is shown in Figure 6.24.

Figure 6.24  Block diagram for the digital replacement filter, $\alpha = \omega_c$, $\beta = e^{-\omega_c T}$.
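
The first-order difference equation of the digital replacement filter is easily evaluated by program. The Python sketch below assumes the recursion y(k) = e^{−ω_c T} y(k − 1) + ω_c u(k) with y(−1) = 0, which is the difference equation model shifted by one sample, and compares the response to a unit impulse sequence with samples of the analogue impulse response ω_c e^{−ω_c t}. The function name and the chosen values ω_c = 1, T = 0.1 are illustrative assumptions.

```python
import math


def digital_filter_response(u, wc, T):
    """Response of y(k+1) - exp(-wc*T) y(k) = wc u(k+1) to the input sequence u, zero initial state."""
    beta = math.exp(-wc * T)
    y_prev = 0.0               # y(-1) = 0
    out = []
    for uk in u:
        y = beta * y_prev + wc * uk
        out.append(y)
        y_prev = y
    return out


if __name__ == "__main__":
    wc, T = 1.0, 0.1
    impulse = [1.0] + [0.0] * 9
    for k, yk in enumerate(digital_filter_response(impulse, wc, T)):
        print(k, round(yk, 6), round(wc * math.exp(-wc * k * T), 6))
```

The two printed columns coincide at the sampling instants, which is what the impulse-matching design strategy sets out to achieve.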


6.10.3  Possible developments


The design method we have considered is called the impulse invariant technique,
and is only one of many available. The interested reader may develop this study in
various ways:

(1) Write a computer program to evaluate the sequence generated by (6.102) with
ω_c = 1, and compare with values obtained at the sampling instants for the impulse
response (6.100) of the prototype analogue filter.

(2) Repeat the … second-order Butterworth filter.

(3) By setting … transfer function of the prototype, and $z = e^{j\omega T}$
in the z tr… digital design, compare the amplitude of the
frequency … For an explanation of the results obtained,
see Chapte…

(4) An alterna… replace s in the Laplace transfer function
with

\[ \frac{2}{T}\,\frac{z-1}{z+1} \]

(this is a process that makes use of the trapezoidal method of approximate
integration). Design alternative digital filters using this technique, which is
commonly referred to as the Tustin (or bilinear transform) method (see
Section 6.11.3).

(5) Show that filters designed using either of these techniques will be stable provided
that the prototype design is itself stable.


6.11  Engineering application: the delta operator and the 𝒟 transform


6.11.1  Introduction


In recent years, sampling rates for digital systems have increased many-fold, and traditional
model formulations based on the z transform have produced unsatisfactory
results in some applications. It is beyond the scope of this text to describe this situation
in detail, but it is possible to give a brief introduction to the problem and to suggest an
approach to the solution. For further details see R. M. Middleton and G. C. Goodwin,
Digital Control and Estimation, A Unified Approach (Prentice Hall, Englewood Cliffs,
NJ, 1990) or W. Forsythe and R. M. Goodall, Digital Control (Macmillan, London, 1991).
The contribution of Colin Paterson to the development of this application is gratefully
acknowledged.


6.11.2  The q or shift operator and the δ operator


In the time domain we define the shift operator q in terms of its effect on a sequence
$\{x_k\}$ as

\[ q\{x_k\} = \{x_{k+1}\} \]

That is, the effect of the shift operator is to shift the sequence by one position, so that
the kth term of the new sequence is the (k + 1)th term of the original sequence. It is then
possible to write the difference equation

\[ y_{k+2} + 2y_{k+1} + 5y_k = u_{k+1} - u_k \]

as

\[ q^{2}y_k + 2qy_k + 5y_k = qu_k - u_k \]

or

\[ (q^{2} + 2q + 5)y_k = (q - 1)u_k \tag{6.103} \]

Note that if we had taken the z transform of the difference equation, with an initially
quiescent system, we would have obtained

\[ (z^{2} + 2z + 5)Y(z) = (z - 1)U(z) \]

We see at once the correspondence between the time-domain q operator and the
z-transform operator $\mathcal{Z}$.
The next step is to introduce the δ operator, defined as

\[ \delta = \frac{q - 1}{\Delta} \]

where Δ has the dimensions of time and is often chosen as the sampling period T. Note
that

\[ \delta y_k = \frac{(q - 1)y_k}{\Delta} = \frac{y_{k+1} - y_k}{\Delta} \]

so that if Δ = T then, in the limit of rapid sampling,

\[ \delta y_k \cong \frac{dy}{dt} \]

Solving for q we see that

\[ q = 1 + \Delta\delta \]

The difference equation (6.103) can thus be written as

\[ ((1 + \Delta\delta)^{2} + 2(1 + \Delta\delta) + 5)y_k = [(1 + \Delta\delta) - 1]u_k \]

or

\[ [(\Delta\delta)^{2} + 4\Delta\delta + 8]y_k = \Delta\delta u_k \]

or, finally, as

\[ \left( \delta^{2} + \frac{4\delta}{\Delta} + \frac{8}{\Delta^{2}} \right) y_k = \frac{\delta}{\Delta}u_k \]

6.11.3  Constructing a discrete-time system model


So far, we have simply demonstrated a method of rewriting a difference equation in an
alternative form. We now examine the possible advantages of constructing discrete-time
system models using the δ operator. To do this, we consider a particular example,
in which we obtain two different discrete-time forms of the second-order Butterworth
filter, both based on the bilinear transform method, sometimes known as Tustin's
method. This method has its origins in the trapezoidal approximation to the integration
process; full details are given in M. J. Chapman, D. P. Goodall and N. C. Steele,
Signal Processing in Electronic Communication (Horwood Publishing, Chichester,
1997).

The continuous-time second-order Butterworth filter with cut-off frequency ω_c = 1
is modelled, as indicated by (6.98), by the differential equation

\[ \frac{d^{2}y}{dt^{2}} + 1.414\,21\frac{dy}{dt} + y = u(t) \tag{6.104} \]

where u(t) is the input and y(t) the filter response. Taking Laplace transforms throughout
on the assumption of quiescent initial conditions, that is y(0) = (dy/dt)(0) = 0, we
obtain the transformed equation

\[ (s^{2} + 1.414\,21s + 1)Y(s) = U(s) \tag{6.105} \]

This represents a stable system, since the system poles, given by

\[ s^{2} + 1.414\,21s + 1 = 0 \]

are located at s = −0.707 10 ± j0.707 10 and thus lie in the left half-plane of the complex
s plane.

We now seek a discrete-time version of the differential equation (6.104). To do this,
we first transform (6.105) into the z domain using the bilinear transform method,
which involves replacing s by

\[ \frac{2}{T}\,\frac{z-1}{z+1} \]

Equation (6.105) then becomes

\[ \left[ \frac{4}{T^{2}}\left( \frac{z-1}{z+1} \right)^{2} + 1.414\,21\,\frac{2}{T}\left( \frac{z-1}{z+1} \right) + 1 \right] Y(z) = U(z) \]

or

\[ [(\tfrac{1}{4}T^{2} + 1.414\,21 \times \tfrac{1}{2}T + 4)z^{2} + (\tfrac{1}{2}T^{2} - 8)z + \tfrac{1}{4}T^{2} - 1.414\,21 \times \tfrac{1}{2}T + 4]Y(z)
 = \tfrac{1}{4}T^{2}(z^{2} + 2z + 1)U(z) \tag{6.106} \]

We can now invert this transformed equation to obtain the time-domain model

\[ (\tfrac{1}{4}T^{2} + 1.414\,21 \times \tfrac{1}{2}T + 4)y_{k+2} + (\tfrac{1}{2}T^{2} - 8)y_{k+1} + (\tfrac{1}{4}T^{2} - 1.414\,21 \times \tfrac{1}{2}T + 4)y_k
 = \tfrac{1}{4}T^{2}(u_{k+2} + 2u_{k+1} + u_k) \tag{6.107} \]

For illustrative purposes we set T = 0.1 s in (6.107) to obtain

\[ 4.073\,21y_{k+2} - 7.995\,00y_{k+1} + 3.931\,79y_k = 0.002\,50(u_{k+2} + 2u_{k+1} + u_k) \]

Note that the roots of the characteristic equation have modulus of about 0.9825, and are
thus quite close to the stability boundary.
When T = 0.01 s, (6.107) becomes

\[ 4.007\,10y_{k+2} - 7.999\,95y_{k+1} + 3.992\,95y_k = 0.000\,03(u_{k+2} + 2u_{k+1} + u_k) \]

In this case the roots have modulus of about 0.9982, and we see that increasing the
sampling rate has moved them even closer to the stability boundary, and that high
accuracy in the coefficients is essential, thus adding to the expense of implementation.

An alternative method of proceeding is to avoid the intermediate stage of obtaining
the z-domain model (6.106) and to proceed directly to a discrete-time representation
from (6.104), using the transformation

\[ s \to \frac{2}{T}\,\frac{q-1}{q+1} \]

leading to the same result as in (6.107). Using the δ operator instead of the shift operator
q, noting that q = 1 + Δδ, we make the transformation

\[ s \to \frac{2}{T}\,\frac{\Delta\delta}{2 + \Delta\delta} \]

or, if T = Δ, the transformation

\[ s \to \frac{2\delta}{2 + \Delta\delta} \]

in (6.105), which becomes

\[ [\delta^{2} + 1.414\,21 \times \tfrac{1}{2}\delta(2 + \Delta\delta) + \tfrac{1}{4}(2 + \Delta\delta)^{2}]y_k = \tfrac{1}{4}(2 + \Delta\delta)^{2}u_k \]

Note that in this form it is easy to see that in the limit as Δ → 0 (that is, as sampling
becomes very fast) we regain the original differential equation model. Rearranging this
equation, we have

\[ \left[ \delta^{2} + \frac{(1.414\,21 + \Delta)}{(1 + 1.414\,21 \times \tfrac{1}{2}\Delta + \tfrac{1}{4}\Delta^{2})}\,\delta + \frac{1}{(1 + 1.414\,21 \times \tfrac{1}{2}\Delta + \tfrac{1}{4}\Delta^{2})} \right] y_k
 = \frac{(2 + \Delta\delta)^{2}}{4(1 + 1.414\,21 \times \tfrac{1}{2}\Delta + \tfrac{1}{4}\Delta^{2})}\,u_k \tag{6.108} \]

Figure 6.25  The $\delta^{-1}$ block.

In order to assess stability, it is helpful to introduce a transform variable γ associated
with the δ operator. This is achieved by defining γ in terms of z as

\[ \gamma = \frac{z - 1}{\Delta} \]

The region of stability in the z plane, |z| < 1, thus becomes

\[ |1 + \Delta\gamma| < 1 \]

or

\[ \left| \frac{1}{\Delta} + \gamma \right| < \frac{1}{\Delta} \tag{6.109} \]

This corresponds to a circle in the γ domain, centre (−1/Δ, 0) and radius 1/Δ. As
Δ → 0, we see that this circle expands in such a way that the stability region is the
entire open left half-plane, and coincides with the stability region for continuous-time
systems.

Let us examine the pole locations for the two cases previously considered, namely
T = 0.1 and T = 0.01. With Δ = T = 0.1, the characteristic equation has the form

\[ \gamma^{2} + 1.410\,92\gamma + 0.931\,78 = 0 \]

with roots, corresponding to poles of the system, at −0.705 46 ± j0.658 87. The centre
of the circular stability region is now at −1/0.1 = −10, with radius 10, and these roots
lie at a radial distance of about 9.3178 from this centre. Note that the distance of
the poles from the stability boundary is just less than 0.7. The poles of the original
continuous-time model were also at about this distance from the appropriate boundary,
and we observe the sharp contrast from our first discretized model, when the discretization
process itself moved the pole locations very close to the stability boundary. In
that approach the situation became exacerbated when the sampling rate was increased,
to T = 0.01, and the poles moved nearer to the boundary. Setting T = 0.01 in the new
formulation, we find that the characteristic equation becomes

\[ \gamma^{2} + 1.414\,13\gamma + 0.992\,95 = 0 \]

with roots at −0.707 06 ± j0.702 14. The stability circle is now centred at −100, with
radius 100, and the radial distance of the poles is about 99.2954. Thus the distance from
the boundary remains at about 0.7. Clearly, in the limit as Δ → 0, the pole locations
become those of the continuous-time model, with the stability circle enlarging to
become the entire left half of the complex γ plane.
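
The pole locations quoted in this comparison can be reproduced from the stated characteristic equations alone. The Python sketch below takes the coefficients as given in the text (they are inputs here, not rederived) and computes the modulus of the z-plane roots and the distance of the γ-plane roots from the centre −1/Δ of the stability circle. The function names are illustrative.

```python
import cmath


def quadratic_roots(a, b, c):
    """Roots of a x^2 + b x + c = 0, allowing complex values."""
    d = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + d) / (2 * a), (-b - d) / (2 * a))


def z_pole_modulus(a, b, c):
    """Modulus of a root of the shift-form characteristic equation a z^2 + b z + c = 0."""
    return abs(quadratic_roots(a, b, c)[0])


def gamma_distance_from_centre(b, c, Delta):
    """Distance of a root of gamma^2 + b gamma + c = 0 from the circle centre -1/Delta."""
    root = quadratic_roots(1.0, b, c)[0]
    return abs(root + 1.0 / Delta)


if __name__ == "__main__":
    print(z_pole_modulus(4.07321, -7.99500, 3.93179))    # T = 0.1, shift form
    print(z_pole_modulus(4.00710, -7.99995, 3.99295))    # T = 0.01, shift form
    d1 = gamma_distance_from_centre(1.41092, 0.93178, 0.1)
    d2 = gamma_distance_from_centre(1.41413, 0.99295, 0.01)
    print(d1, 10 - d1)      # radial distance and margin to the boundary, Delta = 0.1
    print(d2, 100 - d2)     # radial distance and margin to the boundary, Delta = 0.01
```

The margins printed in the last two lines stay near 0.7, whereas the z-plane moduli crowd towards 1 as the sampling interval is reduced.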


6.11.4  Implementing the design


The discussion so far serves to demonstrate the utility of the δ operator formulation, but
the problem of implementation of the design remains. It is possible to construct a $\delta^{-1}$
block based on delay or 1/z blocks, as shown in Figure 6.25. Systems can be realized
using these structures in cascade or otherwise, and simulation studies have produced
successful results. An alternative approach is to make use of the state-space form of
the system model (see Section 6.18). We demonstrate this approach again for the case
T = 0.01, when, with T = Δ = 0.01, (6.108) becomes

\[ (\delta^{2} + 1.414\,13\delta + 0.992\,95)y_k = (0.000\,02\delta^{2} + 0.009\,30\delta + 0.992\,95)u_k \tag{6.110a} \]

Based on (6.110a) we are led to consider the equation

\[ (\delta^{2} + 1.414\,13\delta + 0.992\,95)p_k = u_k \tag{6.110b} \]

Defining the state variables

\[ x_{1,k} = p_k, \qquad x_{2,k} = \delta p_k \]

equation (6.110b) can be represented by the pair of equations

\[ \delta x_{1,k} = x_{2,k} \]
\[ \delta x_{2,k} = -0.992\,95x_{1,k} - 1.414\,13x_{2,k} + u_k \]

Choosing

\[ y_k = 0.992\,95p_k + 0.009\,30\delta p_k + 0.000\,02\delta^{2}p_k \tag{6.110c} \]

equations (6.110b) and (6.110c) are equivalent to (6.110a). In terms of the state
variables we see that

\[ y_k = 0.992\,93x_{1,k} + 0.009\,72x_{2,k} + 0.000\,02u_k \]

Defining the vectors $x_k = [x_{1,k} \;\; x_{2,k}]^{T}$ and $\delta x_k = [\delta x_{1,k} \;\; \delta x_{2,k}]^{T}$, equation (6.110b) can be
represented in matrix form as

\[ \delta x_k = \begin{bmatrix} 0 & 1 \\ -0.992\,95 & -1.414\,13 \end{bmatrix} x_k + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u_k \tag{6.111a} \]

with

\[ y_k = [0.992\,93 \;\; 0.009\,72]x_k + 0.000\,02u_k \tag{6.111b} \]

We now return to the q form to implement the system. Recalling that δ = (q − 1)/Δ,
(6.111a) becomes

\[ qx_k = x_{k+1} = x_k + \Delta \left( \begin{bmatrix} 0 & 1 \\ -0.992\,95 & -1.414\,13 \end{bmatrix} x_k + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u_k \right) \tag{6.112} \]

with (6.111b) remaining the same and where Δ = 0.01, in this case. Equations (6.112)
and (6.111b) may be expressed in the vector-matrix form

\[ x_{k+1} = x_k + \Delta[A(\Delta)x_k + bu_k] \]
\[ y_k = c^{T}(\Delta)x_k + d(\Delta)u_k \]

This matrix difference equation can now be implemented without difficulty using
standard delay blocks, and has a form similar to the result of applying a simple Euler
discretization of the original continuous-time model expressed in state-space form.
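
As an indication of how little is involved in implementing the δ-form state update, the following Python sketch iterates x_{k+1} = x_k + Δ[A x_k + b u_k] with the output equation y_k = c^T x_k + d u_k for a unit step input. The numerical values of A, b, c and d are those quoted in the text for Δ = 0.01; the function name and the choice of a step input are assumptions for the illustration.

```python
def simulate_delta_form(n_steps, Delta=0.01):
    """Unit step response of the delta-form state model, returned as a list of y_k."""
    A = [[0.0, 1.0], [-0.99295, -1.41413]]
    b = [0.0, 1.0]
    c = [0.99293, 0.00972]
    d = 0.00002
    x = [0.0, 0.0]                       # quiescent initial state
    ys = []
    for _ in range(n_steps):
        u = 1.0                          # unit step input
        ys.append(c[0] * x[0] + c[1] * x[1] + d * u)
        # delta x_k = A x_k + b u_k, then x_{k+1} = x_k + Delta * delta x_k
        dx = [A[0][0] * x[0] + A[0][1] * x[1] + b[0] * u,
              A[1][0] * x[0] + A[1][1] * x[1] + b[1] * u]
        x = [x[0] + Delta * dx[0], x[1] + Delta * dx[1]]
    return ys


if __name__ == "__main__":
    ys = simulate_delta_form(1000)
    for k in (0, 100, 200, 500, 999):
        print(f"t = {k * 0.01:5.2f}  y = {ys[k]:.5f}")
```

The response settles close to unity, as would be expected of a low-pass filter with unit gain at zero frequency.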


6.11.5  The 𝒟 transform


In Section 6.11.3 we introduced a transform variable

\[ \gamma = \frac{z - 1}{\Delta} \]

The purpose of this was to enable us to analyse the stability of systems described in the
δ form. We now define a transform in terms of the z transform using the notation given
by R. M. Middleton and G. C. Goodwin, Digital Control and Estimation, A Unified
Approach (Prentice Hall, Englewood Cliffs, NJ, 1990). Let the sequence $\{f_k\}$ have z
transform F(z); then the new transform is given by

\[ F'_{\Delta}(\gamma) = F(z)\big|_{z = 1 + \Delta\gamma} = \sum_{k=0}^{\infty} \frac{f_k}{(1 + \Delta\gamma)^{k}} \]

The 𝒟 transform is formally defined as a slight modification to this form, as

\[ \mathcal{D}(f_k) = F_{\Delta}(\gamma) = \Delta F'_{\Delta}(\gamma) = \Delta \sum_{k=0}^{\infty} \frac{f_k}{(1 + \Delta\gamma)^{k}} \]

The purpose of this modification is to permit the construction of a unified theory of
transforms encompassing both continuous- and discrete-time models in the same
structure. These developments are beyond the scope of the text, but may be pursued
by the interested reader in the reference given above. We conclude the discussion
with an example to illustrate the ideas. The ramp sequence $\{u_k\} = \{k\Delta\}$ can be
obtained by sampling the continuous-time function f(t) = t at intervals Δ. This sequence
has z transform

\[ U(z) = \frac{\Delta z}{(z - 1)^{2}} \]

and the corresponding 𝒟 transform is then

\[ \Delta U'_{\Delta}(\gamma) = \frac{1 + \Delta\gamma}{\gamma^{2}} \]

Note that on setting Δ = 0 and γ = s one recovers the Laplace transform of f(t).
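
The ramp example can be confirmed numerically by summing the defining series. The Python sketch below truncates the sum Δ Σ f_k/(1 + Δγ)^k with f_k = kΔ and compares it with (1 + Δγ)/γ² for real γ > 0, for which the series converges; the function names and the truncation length are assumptions made for the illustration.

```python
def d_transform_ramp(gamma, Delta, n_terms=20000):
    """Truncated series Delta * sum_{k>=0} (k*Delta) / (1 + Delta*gamma)^k for real gamma > 0."""
    z = 1.0 + Delta * gamma
    total = 0.0
    zk = 1.0                   # holds z**k
    for k in range(n_terms):
        total += k * Delta / zk
        zk *= z
    return Delta * total


def ramp_closed_form(gamma, Delta):
    """Closed form (1 + Delta*gamma) / gamma^2 of the transform of the ramp sequence."""
    return (1.0 + Delta * gamma) / gamma ** 2


if __name__ == "__main__":
    for Delta in (0.1, 0.01):
        print(Delta, d_transform_ramp(1.0, Delta), ramp_closed_form(1.0, Delta))
```

As Δ is reduced the values approach 1/γ², the Laplace transform of t evaluated at s = γ.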
6.11.6  Exercises


38  A continuous-time system having input u(t) and
output y(t) is defined by its transfer function

\[ H(s) = \frac{1}{(s+1)(s+2)} \]

Use the methods described above to find the q and
δ form of the discrete-time system model obtained
using the transformation

\[ s \to \frac{2}{\Delta}\,\frac{z-1}{z+1} \]

where Δ is the sampling interval. Examine the
stability of the original system and that of the
discrete-time systems when Δ = 0.1 and when
Δ = 0.01.


39  Use the formula in equation (6.99) to obtain the
transfer function of the third-order Butterworth
filter with ω_c = 1, and obtain the corresponding
δ form discrete-time system when T = Δ.


40  Make the substitution

\[ x_1(t) = y(t), \qquad x_2(t) = \frac{dy(t)}{dt} \]

in Exercise 38 to obtain the state-space form of the
system model,

\[ \dot{x}(t) = Ax(t) + bu(t) \]
\[ y(t) = c^{T}x(t) + du(t) \]

The Euler discretization technique replaces $\dot{x}(t)$ by

\[ \frac{x((k+1)\Delta) - x(k\Delta)}{\Delta} \]

Show that this corresponds to the model obtained
above with A = A(0), c = c(0) and d = d(0).


41  The discretization procedure used in Section 6.11.3
has been based on the bilinear transform method,
derived from the trapezoidal approximation to the
integration process. An alternative approximation
is the Adams–Bashforth procedure, and it can be
shown that this means that we should make the
transformation

\[ s \to \frac{12}{\Delta}\,\frac{z^{2} - z}{5z^{2} + 8z - 1} \]

where Δ is the sampling interval (see W. Forsythe
and R. M. Goodall, Digital Control, Macmillan,
London, 1991). Use this transformation to
discretize the system given by

\[ H(s) = \frac{s}{s+1} \]

when Δ = 0.1 in

(a) the z form, and
(b) the γ form.


6.12 Review exercises (1-18) 


Check your answers using MATLAB or MAPLE whenever possible. 


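1  The signal f(t) = t is sampled at intervals T to
generate the sequence {f(kT)}. Show that

\[ \mathcal{Z}\{f(kT)\} = \frac{Tz}{(z-1)^{2}} \]

2  Show that

\[ \mathcal{Z}\{a^{k}\sin k\omega\} = \frac{az\sin\omega}{z^{2} - 2az\cos\omega + a^{2}} \quad (a > 0) \]

3  Show that

\[ \mathcal{Z}\{k^{2}\} = \frac{z(z+1)}{(z-1)^{3}} \]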

4  Find the impulse response for the system with
transfer function


2 
DM. 


Jefu) m 2 
2 -22+1 


5  Calculate the step response for the system with
transfer function

\[ H(z) = \frac{1}{z^{2} + 3z + 2} \]


6  A process with Laplace transfer function
H(s) = 1/(s + 1) is in cascade with a zero-order
hold device with Laplace transfer function
$G(s) = (1 - e^{-sT})/s$. The overall transfer function
is then

\[ \frac{1 - e^{-sT}}{s(s+1)} \]

Write F(s) = 1/s(s + 1), and find $f(t) = \mathcal{L}^{-1}\{F(s)\}$.
Sample f(t) at intervals T to produce the
sequence {f(kT)} and find $F(z) = \mathcal{Z}\{f(kT)\}$.
Deduce that

\[ e^{-sT}F(s) \to \frac{1}{z}F(z) \]

and hence show that the overall z transfer function
for the process and zero-order hold is

\[ \frac{1 - e^{-T}}{z - e^{-T}} \]





7  A system has Laplace transfer function


Serl 


HOUR SS 


Calculate the impulse response, and obtain the 
z transform of this response when sampled at 
intervals T. 


8  It can be established that if X(z) is the z transform
of the sequence $\{x_n\}$ then the general term of that
sequence is given by

\[ x_n = \frac{1}{j2\pi} \oint_{C} X(z)z^{n-1}\,dz \]

where C is any closed contour containing all
the singularities of X(z). If we assume that all the
singularities of X(z) are poles located within a circle
of finite radius then it is an easy application of the
residue theorem to show that

\[ x_n = \sum [\text{residues of } X(z)z^{n-1} \text{ at poles of } X(z)] \]

(a) Let X(z) = z/(z − a)(z − b), with a and b real.
Where are the poles of X(z)? Calculate the
residues of $z^{n-1}X(z)$, and hence invert the
transform to obtain $\{x_n\}$.

(b) Use the residue method to find 


3 21 2 ue 21 2 
(0) | (i) € Ll 


9  The impulse response of a certain discrete-time
system is $\{(-1)^k - 2^k\}$. What is the step response?

10  A discrete-time system has transfer function


2 


DE i E 
О уй; 


Find the response to the sequence {1, −1, 0, 0, . . . }.


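11  Show that the response of the second-order
system with transfer function

    $$ \frac{z^2}{(z - \alpha)(z - \beta)} $$

to the input $\{1, -(\alpha + \beta), \alpha\beta, 0, 0, 0, \ldots\}$ is
$\{\delta_k\} = \{1, 0, 0, \ldots\}$.
Deduce that the response of the system

    $$ \frac{z}{(z - \alpha)(z - \beta)} $$

to the same input will be
$\{\delta_{k-1}\} = \{0, 1, 0, 0, \ldots\}$.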


12  A system is specified by its Laplace transfer
function

    $$ H(s) = \frac{s}{(s + 1)(s + 2)} $$

Calculate the impulse response $y_\delta(t) = \mathcal{L}^{-1}\{H(s)\}$,
and show that if this response is sampled at
intervals T to generate the sequence $\{y_\delta(nT)\}$
(n = 0, 1, 2, . . . ) then

    $$ D(z) = \mathcal{Z}\{y_\delta(nT)\} = \frac{2z}{z - e^{-2T}} - \frac{z}{z - e^{-T}} $$

A discrete-time system is now constructed so that

    $$ Y(z) = T D(z) X(z) $$

where X(z) is the z transform of the input
sequence $\{x_n\}$ and Y(z) that of the output
sequence $\{y_n\}$, with $x_n = x(nT)$ and $y_n = y(nT)$.
Show that if T = 0.5 s then the difference
equation governing the system is

    $$ y_{n+2} - 0.9744y_{n+1} + 0.2231y_n = 0.5x_{n+2} - 0.4226x_{n+1} $$

Sketch a block diagram for the discrete-time
system modelled by the difference equation

    $$ p_{n+2} - 0.9744p_{n+1} + 0.2231p_n = x_n $$

and verify that the signal $y_n$, as defined above, is
generated by taking $y_n = 0.5p_{n+2} - 0.4226p_{n+1}$ as
output.
13  In a discrete-time position-control system the
position $y_k$ satisfies the difference equation

    $$ y_{k+1} = y_k + a v_k \quad (a \text{ constant}) $$

where $v_k$ and $u_k$ satisfy the difference equations

    $$ v_{k+1} = v_k + b u_k \quad (b \text{ constant}) $$

    $$ u_k = k_1(x_k - y_k) - k_2 v_k \quad (k_1, k_2 \text{ constants}) $$

(a) Show that if $k_1 = 1/(4ab)$ and $k_2 = 1/b$ then the
z transfer function of the system is

    $$ \frac{Y(z)}{X(z)} = \frac{1}{(1 - 2z)^2} $$

where $Y(z) = \mathcal{Z}\{y_k\}$ and $X(z) = \mathcal{Z}\{x_k\}$.

(b) If also $x_k = A$ (where A is a constant),
determine the response sequence $\{y_k\}$ given
that $y_0 = y_1 = 0$.
14  The step response of a continuous-time system is
modelled by the differential equation

    $$ \frac{d^2y}{dt^2} + 3\frac{dy}{dt} + 2y = 1 \quad (t \ge 0) $$

with $y(0) = \dot{y}(0) = 0$. Use the backward-difference
approximation

    $$ \frac{dy}{dt} \simeq \frac{y_k - y_{k-1}}{T}, \qquad \frac{d^2y}{dt^2} \simeq \frac{y_k - 2y_{k-1} + y_{k-2}}{T^2} $$

to show that this differential equation may be
approximated by

    $$ \frac{y_k - 2y_{k-1} + y_{k-2}}{T^2} + 3\,\frac{y_k - y_{k-1}}{T} + 2y_k = 1 $$

Take the z transform of this difference equation,
and show that the system poles are at

    $$ z = \frac{1}{1 + T}, \qquad z = \frac{1}{1 + 2T} $$

Deduce that the general solution is thus

    $$ y_k = \alpha\left(\frac{1}{1 + T}\right)^k + \beta\left(\frac{1}{1 + 2T}\right)^k + \gamma $$

Show that $\gamma = \tfrac{1}{2}$ and, noting that the initial
conditions y(0) = 0 and $\dot{y}(0) = 0$ imply
$y_0 = y_{-1} = 0$, deduce that

    $$ y_k = \tfrac{1}{2}\left[\left(\frac{1}{1 + 2T}\right)^k - 2\left(\frac{1}{1 + T}\right)^k + 1\right] $$

Note that the z-transform method could be used to
obtain this result if we redefine $\mathcal{Z}\{y_k\} = \sum_{j=-1}^{\infty} y_j/z^j$,
with appropriate modifications to the formulae for
the transforms of the shifted sequences.

Explain why the calculation procedure is
always stable in theory, but note the pole
locations for very small T.

Finally, verify that the solution of the
differential equation is

    $$ y(t) = \tfrac{1}{2}(e^{-2t} - 2e^{-t} + 1) $$

and plot graphs of the exact and approximate
solutions with T = 0.1 s and T = 0.05 s.


15  Again consider the step response of the system
modelled by the differential equation

    $$ \frac{d^2y}{dt^2} + 3\frac{dy}{dt} + 2y = 1 \quad (t \ge 0) $$

with $y(0) = \dot{y}(0) = 0$. Now discretize using the
bilinear transform method; that is, take the
Laplace transform and make the transformation

    $$ s \to \frac{2}{T}\,\frac{z - 1}{z + 1} $$

where T is the sampling interval. Show that the
poles of the resulting z transfer function are at

    $$ z = \frac{1 - T}{1 + T}, \qquad z = \frac{2 - T}{2 + T} $$

Deduce that the general solution is then

    $$ y_k = \alpha\left(\frac{1 - T}{1 + T}\right)^k + \beta\left(\frac{2 - T}{2 + T}\right)^k + \gamma $$

Deduce that $\gamma = \tfrac{1}{2}$ and, using the conditions
$y_0 = y_{-1} = 0$, show that

    $$ y_k = \tfrac{1}{2}\left[(1 - T)\left(\frac{1 - T}{1 + T}\right)^k - (2 - T)\left(\frac{2 - T}{2 + T}\right)^k + 1\right] $$

Plot graphs to illustrate the exact solution and
the approximate solution when T = 0.1 s and
T = 0.05 s.


16  Show that the z transform of the sampled version
of the signal $f(t) = t^2$ is

    $$ F(z) = \frac{z(z + 1)\Delta^2}{(z - 1)^3} $$

where $\Delta$ is the sampling interval. Verify that
the $\mathcal{D}$ transform is then

    $$ \frac{(1 + \Delta\gamma)(2 + \Delta\gamma)}{\gamma^3} $$
v 


17  Show that the eigenvalues of the matrix

    $$ A = \begin{bmatrix} 1 & 1 & -2 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{bmatrix} $$

are 2, 1 and −1, and find the corresponding
eigenvectors. Write down the modal matrix M and
spectral matrix $\Lambda$ of A, and verify that $M\Lambda = AM$.
Deduce that the system of difference equations

    $$ x(k + 1) = Ax(k) $$

where $x(k) = [x_1(k) \;\; x_2(k) \;\; x_3(k)]^T$, has a solution

    $$ x(k) = My(k) $$

where $y(k) = \Lambda^k y(0)$. Find this solution, given
$x(0) = [1 \;\; 0 \;\; 0]^T$.


18  The system shown in Figure 6.26 is a realization
of a discrete-time system. Show that, with state
variables $x_1(k)$ and $x_2(k)$ as shown, the system may
be represented as

    $$ x(k + 1) = Ax(k) + bu(k) $$

    $$ y(k) = c^T x(k) $$

Figure 6.26 Discrete-time system of Review exercise 18.


where 


A- =3 E Г |, rs 1 
-2 -1 0 -1 
Calculate the z transfer function of the system,
D(z), where

    $$ D(z) = c^T(zI - A)^{-1}b $$

Reduce the system to control canonical form by
the following means:

(i) calculate the controllability matrix $M_c$, where
$M_c = [b \;\; Ab]$ is the matrix with columns b
and Ab;

(ii) show that rank($M_c$) = 2, and calculate $M_c^{-1}$;

(iii) write down the vector $v^T$ corresponding to
the last row of $M_c^{-1}$;

(iv) form the matrix $T = [v^T \;\; v^T A]^T$, the matrix
with rows $v^T$ and $v^T A$;

(v) calculate $T^{-1}$ and using this matrix T,
show that the transformation z(k) = Tx(k)
produces the system

    $$ z(k + 1) = TAT^{-1}z(k) + Tbu(k) = Cz(k) + b_c u(k) $$


where C is of the form

    $$ C = \begin{bmatrix} 0 & 1 \\ -\alpha & -\beta \end{bmatrix} $$

and $b_c = [0 \;\; 1]^T$. Calculate $\alpha$ and $\beta$, and
comment on the values obtained in relation
to the transfer function D(z).
7.1 Introduction


The representation of a function in the form of a series is fairly common practice in 
mathematics. Probably the most familiar expansions are power series of the form 


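    $$ f(x) = \sum_{n=0}^{\infty} a_n x^n $$

in which the resolved components or base set comprise the power functions

    $$ 1, x, x^2, x^3, \ldots, x^n, \ldots $$

For example, we recall that the exponential function may be represented by the infinite
series

    $$ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots = \sum_{n=0}^{\infty}\frac{x^n}{n!} $$
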
There are frequently advantages in expanding a function in such a series, since the first 
few terms of a good approximation are easy to deal with. For example, term-by-term inte- 
gration or differentiation may be applied or suitable function approximations can be made. 
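
As a small illustration of why the first few terms of such a series are easy to work with, the sketch below sums the exponential series at x = 1 and prints the error against the library value. The function name `exp_partial_sum` is an illustrative choice, not part of the text.

```python
import math

def exp_partial_sum(x, terms):
    """Sum of the first `terms` terms of the series e^x = sum of x^n / n!."""
    total, term = 0.0, 1.0           # term holds x^n / n!, starting at n = 0
    for n in range(terms):
        total += term
        term *= x / (n + 1)          # build the next term from the previous one
    return total

# The error shrinks rapidly as more terms are kept
for terms in (2, 4, 8, 12):
    approx = exp_partial_sum(1.0, terms)
    print(terms, approx, abs(approx - math.e))
```

Each extra term costs one multiplication and one addition, which is part of what makes truncated power series convenient approximations.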
Power functions comprise only one example of a base set for the expansion of functions:
a number of other base sets may be used. In particular, a Fourier series is an
expansion of a periodic function f(t) of period T = 2π/ω in which the base set is the set
of sine functions, giving an expanded representation of the form

    $$ f(t) = A_0 + \sum_{n=1}^{\infty} A_n\sin(n\omega t + \phi_n) $$



Although the idea of expanding a function in the form of such a series had been used 
by Bernoulli, D'Alembert and Euler (c. 1750) to solve problems associated with the 
vibration of strings, it was Joseph Fourier (1768—1830) who developed the approach to 
a stage where it was generally useful. Fourier, a French physicist, was interested in 
heat-flow problems: given an initial temperature at all points of a region, he was con- 
cerned with determining the change in the temperature distribution over time. When 
Fourier postulated in 1807 that an arbitrary function f(x) could be represented by a 
trigonometric series of the form

    $$ \sum_{n=0}^{\infty}(A_n\cos nkx + B_n\sin nkx) $$



the result was considered so startling that it met considerable opposition from the 
leading mathematicians of the time, notably Laplace, Poisson and, more significantly, 
Lagrange, who is regarded as one of the greatest mathematicians of all time. They ques- 
tioned his work because of its lack of rigour, and it was probably this opposition that 
delayed the publication of Fourier’s work, his classic text Théorie Analytique de la 
Chaleur (The Analytical Theory of Heat) not appearing until 1822. This text has since 
become the source for the modern methods of solving practical problems associated 
with partial differential equations subject to prescribed boundary conditions. In addi- 
tion to heat flow, this class of problems includes structural vibrations, wave propagation 
and diffusion, which are discussed in Chapter 9. The task of giving Fourier’s work a
more rigorous mathematical underpinning was undertaken later by Dirichlet (c. 1830)
and subsequently Riemann, his successor at the University of Göttingen.

In addition to its use in solving boundary-value problems associated with partial 
differential equations, Fourier series analysis is central to many other applications in 
engineering. In Chapter 5 we saw how the frequency response of a dynamical system, 
modelled by a linear differential equation with constant coefficients, is readily determined 
and the role that it plays in both system analysis and design. In such cases the frequency 
response, being the steady-state response to a sinusoidal input signal A sin wt, is also a 
sinusoid having the same frequency as the input signal. As mentioned in Section 5.5.6, 
periodic functions, which are not purely sinusoidal, frequently occur as input signals in 
engineering applications, particularly in electrical engineering, since many electrical 
sources of practical value, such as electronic rectifiers, generate non-sinusoidal periodic 
waveforms. Fourier series provide the ideal framework for analysing the steady-state 
response to such periodic input signals, since they enable us to represent the signals as 
infinite sums of sinusoids. The steady-state response due to each sinusoid can then be 
determined as in Section 5.8, and, because of the linear character of the system, the 
desired steady-state response can be determined as the sum of the individual responses. 
As the Fourier series expansion will consist of sinusoids having frequencies nω that are
multiples of the input signal frequency ω, the steady-state response will also have
components having such frequencies. If one of the multiple frequencies nω happens to be
close in value to the natural oscillating frequency of the system, then it will resonate with 
the system, and the component at this frequency will dominate the steady-state response. 
Thus a distinction of significant practical interest between a non-sinusoidal periodic input 
signal and a sinusoidal input signal is that although the signal may have a frequency 
considerably lower than the natural frequency of the system, serious problems can still 
arise owing to resonance. A Fourier series analysis helps to identify such a possibility. 

In Chapter 8 we shall illustrate how Fourier series analysis may be extended to 
aperiodic functions by the use of Fourier transforms. The discrete versions of such 
transforms provide one of the most advanced methods for discrete signal analysis, 
and are widely used in such fields as communications theory and speech and image 
processing. Applications to boundary-value problems are considered in Chapter 9. 


7.2 Fourier series expansion


In this section we develop the Fourier series expansion of periodic functions and dis- 
cuss how closely they approximate the functions. We also indicate how symmetrical 
properties of the function may be taken advantage of in order to reduce the amount 
of mathematical manipulation involved in determining the Fourier series. First the 
properties of periodic functions are briefly reviewed. 


7.2.1 Periodic functions


A function f(t) is said to be periodic if its image values are repeated at regular intervals
in its domain. Thus the graph of a periodic function can be divided into ‘vertical strips’
that are replicas of each other, as illustrated in Figure 7.1. The interval between two
successive replicas is called the period of the function. We therefore say that a function
f(t) is periodic with period T if, for all its domain values t,

    $$ f(t + mT) = f(t) $$

for any integer m.

Figure 7.1 A periodic function with period T.

To provide a measure of the number of repetitions per unit of t, we define the frequency
of a periodic function to be the reciprocal of its period, so that

    $$ \text{frequency} = \frac{1}{\text{period}} = \frac{1}{T} $$

The term circular frequency is also used in engineering, and is defined by

    $$ \text{circular frequency} = 2\pi \times \text{frequency} = \frac{2\pi}{T} $$


and is measured in radians per second. It is common to drop the term ‘circular’ and refer 
to this simply as the frequency when the context is clear. 


7.2.2 Fourier's theorem


This theorem states that a periodic function that satisfies certain conditions can be
expressed as the sum of a number of sine functions of different amplitudes, phases and
periods. That is, if f(t) is a periodic function with period T then

    $$ f(t) = A_0 + A_1\sin(\omega t + \phi_1) + A_2\sin(2\omega t + \phi_2) + \cdots + A_n\sin(n\omega t + \phi_n) + \cdots $$    (7.1)

where the As and φs are constants and ω = 2π/T is the frequency of f(t). The term
$A_1\sin(\omega t + \phi_1)$ is called the first harmonic or the fundamental mode, and it has the
same frequency ω as the parent function f(t). The term $A_n\sin(n\omega t + \phi_n)$ is called the
nth harmonic, and it has frequency nω, which is n times that of the fundamental. $A_n$
denotes the amplitude of the nth harmonic and $\phi_n$ is its phase angle, measuring the lag
or lead of the nth harmonic with reference to a pure sine wave of the same frequency.
Since

    $$ A_n\sin(n\omega t + \phi_n) \equiv (A_n\cos\phi_n)\sin n\omega t + (A_n\sin\phi_n)\cos n\omega t = b_n\sin n\omega t + a_n\cos n\omega t $$

where

    $$ b_n = A_n\cos\phi_n, \qquad a_n = A_n\sin\phi_n $$    (7.2)


the expansion (7.1) may be written as

    $$ f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n\cos n\omega t + \sum_{n=1}^{\infty} b_n\sin n\omega t $$    (7.3)

where $a_0 = 2A_0$ (we shall see later that taking the first term as $\tfrac{1}{2}a_0$ rather than $a_0$ is a
convenience that enables us to make $a_0$ fit a general result). The expansion (7.3) is called
the Fourier series expansion of the function f(t), and the as and bs are called the Fourier
coefficients. In electrical engineering it is common practice to refer to $a_n$ and $b_n$ respectively
as the in-phase and phase quadrature components of the nth harmonic, this
terminology arising from the use of the phasor notation $e^{\mathrm{j}n\omega t} = \cos n\omega t + \mathrm{j}\sin n\omega t$.
Clearly, (7.1) is an alternative representation of the Fourier series with the amplitude
and phase of the nth harmonic being determined from (7.2) as

    $$ A_n = \sqrt{a_n^2 + b_n^2}, \qquad \phi_n = \tan^{-1}\left(\frac{a_n}{b_n}\right) $$

with care being taken over choice of quadrant.
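
The remark about the choice of quadrant can be made concrete with a short sketch. It assumes the convention of (7.2), $a_n = A_n\sin\phi_n$ and $b_n = A_n\cos\phi_n$, and uses the two-argument arctangent so that the signs of both coefficients are taken into account; the function name `amplitude_phase` is illustrative only.

```python
import math

def amplitude_phase(a_n, b_n):
    """Return (A_n, phi_n) such that a_n = A_n sin(phi_n) and b_n = A_n cos(phi_n)."""
    amplitude = math.hypot(a_n, b_n)
    phase = math.atan2(a_n, b_n)     # quadrant-aware form of arctan(a_n / b_n)
    return amplitude, phase

# Both pairs have a_n / b_n = 1, yet the phases differ by pi
A1, phi1 = amplitude_phase(1.0, 1.0)
A2, phi2 = amplitude_phase(-1.0, -1.0)
print(A1, phi1)
print(A2, phi2)
```

A plain arctangent of the ratio would return the same angle for both pairs, which is the pitfall the text warns about.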
The Fourier coefficients are given by

    $$ a_n = \frac{2}{T}\int_d^{d+T} f(t)\cos n\omega t\,dt \quad (n = 0, 1, 2, \ldots) $$    (7.4)

    $$ b_n = \frac{2}{T}\int_d^{d+T} f(t)\sin n\omega t\,dt \quad (n = 1, 2, 3, \ldots) $$    (7.5)


which are known as Euler’s formulae. 
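Before proceeding to verify (7.4) and (7.5), we first state the following integrals, in
which T = 2π/ω:

    $$ \int_d^{d+T}\cos n\omega t\,dt = \begin{cases} 0 & (n \ne 0) \\ T & (n = 0) \end{cases} $$    (7.6)

    $$ \int_d^{d+T}\sin n\omega t\,dt = 0 \quad (\text{all } n) $$    (7.7)

    $$ \int_d^{d+T}\sin m\omega t\,\sin n\omega t\,dt = \begin{cases} 0 & (m \ne n) \\ \tfrac{1}{2}T & (m = n \ne 0) \end{cases} $$    (7.8)

    $$ \int_d^{d+T}\cos m\omega t\,\cos n\omega t\,dt = \begin{cases} 0 & (m \ne n) \\ \tfrac{1}{2}T & (m = n \ne 0) \end{cases} $$    (7.9)

    $$ \int_d^{d+T}\cos m\omega t\,\sin n\omega t\,dt = 0 \quad (\text{all } m \text{ and } n) $$    (7.10)

The results (7.6)–(7.10) constitute the orthogonality relations for sine and cosine
functions, and show that the set of functions

    $$ \{1, \cos\omega t, \cos 2\omega t, \ldots, \cos n\omega t, \sin\omega t, \sin 2\omega t, \ldots, \sin n\omega t\} $$

is an orthogonal set of functions on the interval d ≤ t ≤ d + T. The choice of d is
arbitrary in these results, it only being necessary to integrate over a period of duration T.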
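
The orthogonality relations can be spot-checked numerically. The following is a minimal sketch, not a proof: it approximates the integrals in (7.8)–(7.10) by the midpoint rule for an arbitrarily chosen period T and starting point d (both values are assumptions made for the illustration).

```python
import math

def integrate(g, a, b, steps=2000):
    """Midpoint-rule approximation to the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

T = 2.0                       # illustrative period (assumed)
omega = 2 * math.pi / T
d = 0.3                       # arbitrary start of the integration window

def cos_m(m):
    return lambda t: math.cos(m * omega * t)

def sin_m(m):
    return lambda t: math.sin(m * omega * t)

def inner(u, v):
    """Integral of u(t) v(t) over one period starting at d."""
    return integrate(lambda t: u(t) * v(t), d, d + T)

sin_same = inner(sin_m(3), sin_m(3))    # (7.8) with m = n = 3
sin_diff = inner(sin_m(2), sin_m(3))    # (7.8) with m != n
cos_same = inner(cos_m(4), cos_m(4))    # (7.9) with m = n = 4
mixed = inner(cos_m(2), sin_m(2))       # (7.10)
print(sin_same, sin_diff, cos_same, mixed)
```

Changing `d` leaves the four values unchanged to rounding error, in line with the remark that only the duration of the interval matters.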
Integrating the series (7.3) with respect to t over the period t = d to t = d + T, and
using (7.6) and (7.7), we find that each term on the right-hand side is zero except for
the term involving $a_0$; that is, we have

    $$ \int_d^{d+T} f(t)\,dt = \tfrac{1}{2}a_0\int_d^{d+T}dt + \sum_{n=1}^{\infty}\left(a_n\int_d^{d+T}\cos n\omega t\,dt + b_n\int_d^{d+T}\sin n\omega t\,dt\right) $$

    $$ = \tfrac{1}{2}a_0(T) + \sum_{n=1}^{\infty}[a_n(0) + b_n(0)] = \tfrac{1}{2}Ta_0 $$

Thus

    $$ \tfrac{1}{2}a_0 = \frac{1}{T}\int_d^{d+T} f(t)\,dt $$

and we can see that the constant term $\tfrac{1}{2}a_0$ in the Fourier series expansion represents the
mean value of the function f(t) over one period. For an electrical signal it represents the
bias level or DC (direct current) component. Hence

    $$ a_0 = \frac{2}{T}\int_d^{d+T} f(t)\,dt $$    (7.11)

To obtain this result, we have assumed that term-by-term integration of the series (7.3)
is permissible. This is indeed so because of the convergence properties of the series;
its validity is discussed in detail in more advanced texts.

To obtain the Fourier coefficient $a_n$ (n ≠ 0), we multiply (7.3) throughout by cos mωt
and integrate with respect to t over the period t = d to t = d + T, giving

    $$ \int_d^{d+T} f(t)\cos m\omega t\,dt = \tfrac{1}{2}a_0\int_d^{d+T}\cos m\omega t\,dt + \sum_{n=1}^{\infty} a_n\int_d^{d+T}\cos n\omega t\,\cos m\omega t\,dt + \sum_{n=1}^{\infty} b_n\int_d^{d+T}\cos m\omega t\,\sin n\omega t\,dt $$

Assuming term-by-term integration to be possible, and using (7.6), (7.9) and (7.10), we
find that, when m ≠ 0, the only non-zero integral on the right-hand side is the one that
occurs in the first summation when n = m. That is, we have

    $$ \int_d^{d+T} f(t)\cos m\omega t\,dt = a_m\int_d^{d+T}\cos m\omega t\,\cos m\omega t\,dt = \tfrac{1}{2}a_mT $$

giving

    $$ a_m = \frac{2}{T}\int_d^{d+T} f(t)\cos m\omega t\,dt $$

which, on replacing m by n, gives

    $$ a_n = \frac{2}{T}\int_d^{d+T} f(t)\cos n\omega t\,dt $$    (7.12)
The value of $a_0$ given in (7.11) may be obtained by taking n = 0 in (7.12), so that we
may write

    $$ a_n = \frac{2}{T}\int_d^{d+T} f(t)\cos n\omega t\,dt \quad (n = 0, 1, 2, \ldots) $$

which verifies formula (7.4). This explains why the constant term in the Fourier series
expansion was taken as $\tfrac{1}{2}a_0$ and not $a_0$, since this ensures compatibility of the results
(7.11) and (7.12). Although $a_0$ and $a_n$ satisfy the same formula, it is usually safer to
work them out separately.

Finally, to obtain the Fourier coefficients $b_n$, we multiply (7.3) throughout by
sin mωt and integrate with respect to t over the period t = d to t = d + T, giving

    $$ \int_d^{d+T} f(t)\sin m\omega t\,dt = \tfrac{1}{2}a_0\int_d^{d+T}\sin m\omega t\,dt + \sum_{n=1}^{\infty}\left(a_n\int_d^{d+T}\sin m\omega t\,\cos n\omega t\,dt + b_n\int_d^{d+T}\sin m\omega t\,\sin n\omega t\,dt\right) $$

Assuming term-by-term integration to be possible, and using (7.7), (7.8) and (7.10), we
find that the only non-zero integral on the right-hand side is the one that occurs in the
second summation when m = n. That is, we have

    $$ \int_d^{d+T} f(t)\sin m\omega t\,dt = b_m\int_d^{d+T}\sin m\omega t\,\sin m\omega t\,dt = \tfrac{1}{2}b_mT $$

giving, on replacing m by n,

    $$ b_n = \frac{2}{T}\int_d^{d+T} f(t)\sin n\omega t\,dt \quad (n = 1, 2, 3, \ldots) $$

which verifies formula (7.5).


Summary 


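In summary, we have shown that if a periodic function f(t) of period T = 2π/ω can
be expanded as a Fourier series then that series is given by

    $$ f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n\cos n\omega t + \sum_{n=1}^{\infty} b_n\sin n\omega t $$    (7.3)

where the coefficients are given by the Euler formulae

    $$ a_n = \frac{2}{T}\int_d^{d+T} f(t)\cos n\omega t\,dt \quad (n = 0, 1, 2, \ldots) $$    (7.4)

    $$ b_n = \frac{2}{T}\int_d^{d+T} f(t)\sin n\omega t\,dt \quad (n = 1, 2, 3, \ldots) $$    (7.5)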
The limits of integration in Euler's formulae may be specified over any period, so that
the choice of d is arbitrary, and may be made in such a way as to help in the calculation of
$a_n$ and $b_n$. In practice, it is common to specify f(t) over either the period $-\tfrac{1}{2}T < t < \tfrac{1}{2}T$ or
the period 0 < t < T, leading respectively to the limits of integration being $-\tfrac{1}{2}T$ and
$\tfrac{1}{2}T$ (that is, $d = -\tfrac{1}{2}T$) or 0 and T (that is, d = 0).
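
The Euler formulae (7.4) and (7.5) translate directly into a numerical recipe. The sketch below is one possible approach, using the midpoint rule with an assumed number of steps; the test signal and its period are made up for the illustration so that the exact coefficients are known in advance ($a_0 = 2$, $a_1 = 3$, $b_2 = -2$, all others zero).

```python
import math

def fourier_coefficients(f, T, n, d=0.0, steps=4000):
    """Approximate (a_n, b_n) from (7.4) and (7.5) by the midpoint rule on [d, d + T]."""
    omega = 2 * math.pi / T
    h = T / steps
    a_sum = b_sum = 0.0
    for i in range(steps):
        t = d + (i + 0.5) * h
        a_sum += f(t) * math.cos(n * omega * t)
        b_sum += f(t) * math.sin(n * omega * t)
    return 2 * a_sum * h / T, 2 * b_sum * h / T

T = 4.0                                   # illustrative period (assumed)
omega = 2 * math.pi / T

def signal(t):
    """Test signal with known coefficients: a_0 = 2, a_1 = 3, b_2 = -2."""
    return 1.0 + 3.0 * math.cos(omega * t) - 2.0 * math.sin(2 * omega * t)

# The same coefficients are obtained whichever period is integrated over
for d in (0.0, -T / 2, 1.234):
    print(d, [fourier_coefficients(signal, T, n, d) for n in range(3)])
```

The printed coefficients agree for every choice of d, which is the arbitrariness noted above.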

It is also worth noting that an alternative approach may simplify the calculation of
$a_n$ and $b_n$. Using the formula

    $$ e^{\mathrm{j}n\omega t} = \cos n\omega t + \mathrm{j}\sin n\omega t $$

we have

    $$ a_n + \mathrm{j}b_n = \frac{2}{T}\int_d^{d+T} f(t)\,e^{\mathrm{j}n\omega t}\,dt $$    (7.13)

Evaluating this integral and equating real and imaginary parts on each side gives the
values of $a_n$ and $b_n$. This approach is particularly useful when only the amplitude
$|a_n + \mathrm{j}b_n|$ of the nth harmonic is required.
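
Formula (7.13) can be evaluated numerically with complex arithmetic, which yields both coefficients and the harmonic amplitude from a single integral. The sketch below is illustrative only: it uses the midpoint rule and, as an assumed test case, the sawtooth f(t) = t on 0 < t < 2π, for which the third harmonic has $a_3 = 0$ and $b_3 = -\tfrac{2}{3}$.

```python
import cmath
import math

def complex_coefficient(f, T, n, d=0.0, steps=20000):
    """Approximate a_n + j b_n from (7.13) by the midpoint rule on [d, d + T]."""
    omega = 2 * math.pi / T
    h = T / steps
    total = 0j
    for i in range(steps):
        t = d + (i + 0.5) * h
        total += f(t) * cmath.exp(1j * n * omega * t)
    return 2 * total * h / T

def sawtooth(t):
    """One period of f(t) = t, valid for 0 < t < 2*pi."""
    return t

c3 = complex_coefficient(sawtooth, 2 * math.pi, 3)
a3, b3 = c3.real, c3.imag          # equate real and imaginary parts
amplitude3 = abs(c3)               # amplitude of the third harmonic
print(a3, b3, amplitude3)
```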


7.2.3 Functions of period 2π


If the period T of the periodic function f(t) is taken to be 2π then ω = 1, and the
series (7.3) becomes

    $$ f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n\cos nt + \sum_{n=1}^{\infty} b_n\sin nt $$    (7.14)

with the coefficients given by

    $$ a_n = \frac{1}{\pi}\int_d^{d+2\pi} f(t)\cos nt\,dt \quad (n = 0, 1, 2, \ldots) $$    (7.15)

    $$ b_n = \frac{1}{\pi}\int_d^{d+2\pi} f(t)\sin nt\,dt \quad (n = 1, 2, \ldots) $$    (7.16)

While a unit frequency may rarely be encountered in practice, consideration of this particular
case reduces the amount of mathematical manipulation involved in determining
the coefficients $a_n$ and $b_n$. Also, there is no loss of generality in considering this case,
since if we have a function f(t) of period T, we may write $t_1 = 2\pi t/T$, so that

    $$ f(t) \equiv f\left(\frac{Tt_1}{2\pi}\right) \equiv F(t_1) $$

where $F(t_1)$ is a function of period 2π. That is, by a simple change of variable, a periodic
function f(t) of period T may be transformed into a periodic function $F(t_1)$ of period 2π.
Thus, in order to develop an initial understanding and to discuss some of the properties
of Fourier series, we shall first consider functions of period 2π, returning to functions
of period other than 2π in Section 7.2.7.
Example 7.1 Obtain the Fourier series expansion of the periodic function f(t) of period 2π defined by

    $$ f(t) = t \quad (0 < t < 2\pi), \qquad f(t) = f(t + 2\pi) $$

Figure 7.2 Sawtooth wave of Example 7.1.

Solution A sketch of the function f(t) over the interval −4π < t < 4π is shown in Figure 7.2.
Since the function is periodic we only need to sketch it over one period, the pattern
being repeated for other periods. Using (7.15) to evaluate the Fourier coefficients $a_0$ and
$a_n$ gives

    $$ a_0 = \frac{1}{\pi}\int_0^{2\pi} f(t)\,dt = \frac{1}{\pi}\int_0^{2\pi} t\,dt = \frac{1}{\pi}\left[\frac{t^2}{2}\right]_0^{2\pi} = 2\pi $$

and

    $$ a_n = \frac{1}{\pi}\int_0^{2\pi} f(t)\cos nt\,dt = \frac{1}{\pi}\int_0^{2\pi} t\cos nt\,dt \quad (n = 1, 2, \ldots) $$

which, on integration by parts, gives

    $$ a_n = \frac{1}{\pi}\left[\frac{t}{n}\sin nt + \frac{1}{n^2}\cos nt\right]_0^{2\pi} = \frac{1}{\pi}\left(\frac{2\pi}{n}\sin 2n\pi + \frac{1}{n^2}\cos 2n\pi - \frac{1}{n^2}\right) = 0 $$

since sin 2nπ = 0 and cos 2nπ = cos 0 = 1. Note the need to work out $a_0$ separately from
$a_n$ in this case. The formula (7.16) for $b_n$ gives

    $$ b_n = \frac{1}{\pi}\int_0^{2\pi} f(t)\sin nt\,dt = \frac{1}{\pi}\int_0^{2\pi} t\sin nt\,dt \quad (n = 1, 2, \ldots) $$

which, on integration by parts, gives

    $$ b_n = \frac{1}{\pi}\left[-\frac{t}{n}\cos nt + \frac{1}{n^2}\sin nt\right]_0^{2\pi} $$

    $$ = \frac{1}{\pi}\left(-\frac{2\pi}{n}\cos 2n\pi\right) \quad (\text{since } \sin 2n\pi = \sin 0 = 0) $$

    $$ = -\frac{2}{n} \quad (\text{since } \cos 2n\pi = 1) $$
Hence from (7.14) the Fourier series expansion of f(t) is

    $$ f(t) = \pi - \sum_{n=1}^{\infty}\frac{2}{n}\sin nt $$

or, in expanded form,

    $$ f(t) = \pi - 2\left(\sin t + \frac{\sin 2t}{2} + \frac{\sin 3t}{3} + \cdots + \frac{\sin nt}{n} + \cdots\right) $$
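
To see how the series of Example 7.1 behaves, the partial sums can be evaluated directly. The sketch below (function names are illustrative) compares a truncated series with the sawtooth at two interior points, and also evaluates it at the jump t = 0, where every sine term vanishes and the sum equals π, the average of the values 2π and 0 on either side.

```python
import math

def sawtooth(t):
    """Periodic extension of f(t) = t (0 < t < 2*pi)."""
    return t % (2 * math.pi)

def sawtooth_partial_sum(t, N):
    """Partial sum pi - 2 * sum_{n=1}^{N} sin(nt)/n of the series in Example 7.1."""
    return math.pi - 2.0 * sum(math.sin(n * t) / n for n in range(1, N + 1))

N = 2000
for t in (1.0, math.pi / 2):
    print(t, sawtooth(t), sawtooth_partial_sum(t, N))

# At the discontinuity all sine terms are zero, leaving the mean value pi
print(sawtooth_partial_sum(0.0, N))
```

Convergence is slow here because the coefficients decay only like 1/n, which is typical of a function with jump discontinuities.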


Example 7.2 A periodic function f(t) with period 2π is defined by

    $$ f(t) = t^2 + t \quad (-\pi < t < \pi), \qquad f(t) = f(t + 2\pi) $$

Sketch a graph of the function f(t) for values of t from t = −3π to t = 3π and obtain a
Fourier series expansion of the function.

Figure 7.3 Graph of the function f(t) of Example 7.2.

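Solution A graph of the function f(t) for −3π < t < 3π is shown in Figure 7.3. From (7.15)
we have

    $$ a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\,dt = \frac{1}{\pi}\int_{-\pi}^{\pi}(t^2 + t)\,dt = \tfrac{2}{3}\pi^2 $$

and

    $$ a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos nt\,dt = \frac{1}{\pi}\int_{-\pi}^{\pi}(t^2 + t)\cos nt\,dt \quad (n = 1, 2, 3, \ldots) $$

which, on integration by parts, gives

    $$ a_n = \frac{1}{\pi}\left[\frac{t^2}{n}\sin nt + \frac{2t}{n^2}\cos nt - \frac{2}{n^3}\sin nt + \frac{t}{n}\sin nt + \frac{1}{n^2}\cos nt\right]_{-\pi}^{\pi} $$

    $$ = \frac{1}{\pi}\,\frac{4\pi}{n^2}\cos n\pi \quad \left(\text{since } \sin n\pi = 0 \text{ and } \left[\frac{1}{n^2}\cos nt\right]_{-\pi}^{\pi} = 0\right) $$

    $$ = \frac{4}{n^2}(-1)^n \quad (\text{since } \cos n\pi = (-1)^n) $$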
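From (7.16)

    $$ b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin nt\,dt = \frac{1}{\pi}\int_{-\pi}^{\pi}(t^2 + t)\sin nt\,dt \quad (n = 1, 2, 3, \ldots) $$

which, on integration by parts, gives

    $$ b_n = \frac{1}{\pi}\left[-\frac{t^2}{n}\cos nt + \frac{2t}{n^2}\sin nt + \frac{2}{n^3}\cos nt - \frac{t}{n}\cos nt + \frac{1}{n^2}\sin nt\right]_{-\pi}^{\pi} $$

    $$ = -\frac{2}{n}\cos n\pi = -\frac{2}{n}(-1)^n \quad (\text{since } \cos n\pi = (-1)^n) $$

Hence from (7.14) the Fourier series expansion of f(t) is

    $$ f(t) = \tfrac{1}{3}\pi^2 + \sum_{n=1}^{\infty}\frac{4}{n^2}(-1)^n\cos nt - \sum_{n=1}^{\infty}\frac{2}{n}(-1)^n\sin nt $$

or, in expanded form,

    $$ f(t) = \tfrac{1}{3}\pi^2 + 4\left(-\cos t + \frac{\cos 2t}{2^2} - \frac{\cos 3t}{3^2} + \cdots\right) + 2\left(\sin t - \frac{\sin 2t}{2} + \frac{\sin 3t}{3} - \cdots\right) $$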
2 3 2 3 


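To illustrate the alternative approach, using (7.13) gives

    $$ a_n + \mathrm{j}b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\,e^{\mathrm{j}nt}\,dt = \frac{1}{\pi}\int_{-\pi}^{\pi}(t^2 + t)\,e^{\mathrm{j}nt}\,dt $$

    $$ = \frac{1}{\pi}\left\{\left[\frac{t^2 + t}{\mathrm{j}n}e^{\mathrm{j}nt}\right]_{-\pi}^{\pi} - \int_{-\pi}^{\pi}\frac{2t + 1}{\mathrm{j}n}e^{\mathrm{j}nt}\,dt\right\} $$

    $$ = \frac{1}{\pi}\left[\frac{t^2 + t}{\mathrm{j}n}e^{\mathrm{j}nt} - \frac{2t + 1}{(\mathrm{j}n)^2}e^{\mathrm{j}nt} + \frac{2}{(\mathrm{j}n)^3}e^{\mathrm{j}nt}\right]_{-\pi}^{\pi} $$

Since

    $$ e^{\mathrm{j}n\pi} = \cos n\pi + \mathrm{j}\sin n\pi = (-1)^n $$

    $$ e^{-\mathrm{j}n\pi} = \cos n\pi - \mathrm{j}\sin n\pi = (-1)^n $$

and

    $$ 1/\mathrm{j} = -\mathrm{j} $$

we have

    $$ a_n + \mathrm{j}b_n = \frac{(-1)^n}{\pi}\left(-\mathrm{j}\frac{\pi^2 + \pi}{n} + \frac{2\pi + 1}{n^2} + \frac{\mathrm{j}2}{n^3} + \mathrm{j}\frac{\pi^2 - \pi}{n} - \frac{1 - 2\pi}{n^2} - \frac{\mathrm{j}2}{n^3}\right) $$

    $$ = (-1)^n\left(\frac{4}{n^2} - \mathrm{j}\frac{2}{n}\right) $$

Equating real and imaginary parts gives, as before,

    $$ a_n = \frac{4}{n^2}(-1)^n, \qquad b_n = -\frac{2}{n}(-1)^n $$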
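
The coefficients just obtained can be checked without symbolic software. The following sketch approximates (7.15) and (7.16) for f(t) = t² + t by the midpoint rule and prints the result next to the closed forms $4(-1)^n/n^2$ and $-2(-1)^n/n$; the helper name and step count are assumptions made for the illustration.

```python
import math

def coefficients_2pi(f, n, d=-math.pi, steps=20000):
    """Approximate (a_n, b_n) of (7.15) and (7.16) for a function of period 2*pi."""
    h = 2 * math.pi / steps
    a_sum = b_sum = 0.0
    for i in range(steps):
        t = d + (i + 0.5) * h
        a_sum += f(t) * math.cos(n * t)
        b_sum += f(t) * math.sin(n * t)
    return a_sum * h / math.pi, b_sum * h / math.pi

def f(t):
    """f(t) = t^2 + t on the period -pi < t < pi."""
    return t * t + t

a_0, _ = coefficients_2pi(f, 0)
print(0, a_0, 2 * math.pi ** 2 / 3)
for n in range(1, 5):
    a_n, b_n = coefficients_2pi(f, n)
    print(n, a_n, 4 * (-1) ** n / n ** 2, b_n, -2 * (-1) ** n / n)
```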


A periodic function f(t) may be specified in a piecewise fashion over a period, or,
indeed, it may only be piecewise-continuous over a period, as illustrated in Figure 7.4.
In order to calculate the Fourier coefficients in such cases, it is necessary to break up
the range of integration in the Euler formulae to correspond to the various components
of the function. For example, for the function shown in Figure 7.4, f(t) is defined in the
interval −π < t < π by

    $$ f(t) = \begin{cases} f_1(t) & (-\pi < t < -p) \\ f_2(t) & (-p < t < q) \\ f_3(t) & (q < t < \pi) \end{cases} $$

and is periodic with period 2π. The Euler formulae (7.15) and (7.16) for the Fourier
coefficients become

    $$ a_n = \frac{1}{\pi}\left[\int_{-\pi}^{-p} f_1(t)\cos nt\,dt + \int_{-p}^{q} f_2(t)\cos nt\,dt + \int_{q}^{\pi} f_3(t)\cos nt\,dt\right] $$

    $$ b_n = \frac{1}{\pi}\left[\int_{-\pi}^{-p} f_1(t)\sin nt\,dt + \int_{-p}^{q} f_2(t)\sin nt\,dt + \int_{q}^{\pi} f_3(t)\sin nt\,dt\right] $$

Figure 7.4 Piecewise-continuous function over a period.


Example 7.3 A periodic function f(t) of period 2π is defined within the period 0 < t < 2π by

    $$ f(t) = \begin{cases} t & (0 < t < \tfrac{1}{2}\pi) \\ \tfrac{1}{2}\pi & (\tfrac{1}{2}\pi < t < \pi) \\ \pi - \tfrac{1}{2}t & (\pi < t < 2\pi) \end{cases} $$

Sketch a graph of f(t) for −2π < t < 3π and find a Fourier series expansion of it.

Figure 7.5 Graph of the function f(t) of Example 7.3.

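Solution A graph of the function f(t) for −2π < t < 3π is shown in Figure 7.5. From (7.15),

    $$ a_0 = \frac{1}{\pi}\int_0^{2\pi} f(t)\,dt = \frac{1}{\pi}\left[\int_0^{\pi/2} t\,dt + \int_{\pi/2}^{\pi}\tfrac{1}{2}\pi\,dt + \int_{\pi}^{2\pi}(\pi - \tfrac{1}{2}t)\,dt\right] = \tfrac{5}{8}\pi $$

and

    $$ a_n = \frac{1}{\pi}\int_0^{2\pi} f(t)\cos nt\,dt \quad (n = 1, 2, 3, \ldots) $$

    $$ = \frac{1}{\pi}\left[\int_0^{\pi/2} t\cos nt\,dt + \int_{\pi/2}^{\pi}\tfrac{1}{2}\pi\cos nt\,dt + \int_{\pi}^{2\pi}(\pi - \tfrac{1}{2}t)\cos nt\,dt\right] $$

    $$ = \frac{1}{\pi}\left\{\left[\frac{t}{n}\sin nt + \frac{\cos nt}{n^2}\right]_0^{\pi/2} + \left[\frac{\pi}{2n}\sin nt\right]_{\pi/2}^{\pi} + \left[\frac{2\pi - t}{2n}\sin nt - \frac{\cos nt}{2n^2}\right]_{\pi}^{2\pi}\right\} $$

    $$ = \frac{1}{\pi}\left(\frac{\pi}{2n}\sin\tfrac{1}{2}n\pi + \frac{1}{n^2}\cos\tfrac{1}{2}n\pi - \frac{1}{n^2} - \frac{\pi}{2n}\sin\tfrac{1}{2}n\pi - \frac{1}{2n^2} + \frac{1}{2n^2}\cos n\pi\right) $$

    $$ = \frac{1}{2\pi n^2}\left(2\cos\tfrac{1}{2}n\pi - 3 + \cos n\pi\right) $$

that is,

    $$ a_n = \begin{cases} \dfrac{1}{\pi n^2}\left[(-1)^{n/2} - 1\right] & (\text{even } n) \\ -\dfrac{2}{\pi n^2} & (\text{odd } n) \end{cases} $$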
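From (7.16),

    $$ b_n = \frac{1}{\pi}\int_0^{2\pi} f(t)\sin nt\,dt \quad (n = 1, 2, 3, \ldots) $$

    $$ = \frac{1}{\pi}\left[\int_0^{\pi/2} t\sin nt\,dt + \int_{\pi/2}^{\pi}\tfrac{1}{2}\pi\sin nt\,dt + \int_{\pi}^{2\pi}(\pi - \tfrac{1}{2}t)\sin nt\,dt\right] $$

    $$ = \frac{1}{\pi}\left\{\left[-\frac{t}{n}\cos nt + \frac{1}{n^2}\sin nt\right]_0^{\pi/2} + \left[-\frac{\pi}{2n}\cos nt\right]_{\pi/2}^{\pi} + \left[\frac{t - 2\pi}{2n}\cos nt - \frac{1}{2n^2}\sin nt\right]_{\pi}^{2\pi}\right\} $$

    $$ = \frac{1}{\pi}\left(-\frac{\pi}{2n}\cos\tfrac{1}{2}n\pi + \frac{1}{n^2}\sin\tfrac{1}{2}n\pi - \frac{\pi}{2n}\cos n\pi + \frac{\pi}{2n}\cos\tfrac{1}{2}n\pi + \frac{\pi}{2n}\cos n\pi\right) $$

    $$ = \frac{1}{\pi n^2}\sin\tfrac{1}{2}n\pi = \begin{cases} 0 & (\text{even } n) \\ \dfrac{(-1)^{(n-1)/2}}{\pi n^2} & (\text{odd } n) \end{cases} $$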

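Hence from (7.14) the Fourier series expansion of f(t) is

    $$ f(t) = \tfrac{5}{16}\pi - \frac{2}{\pi}\left(\cos t + \frac{\cos 3t}{3^2} + \frac{\cos 5t}{5^2} + \cdots\right) - \frac{2}{\pi}\left(\frac{\cos 2t}{2^2} + \frac{\cos 6t}{6^2} + \frac{\cos 10t}{10^2} + \cdots\right) $$

    $$ \qquad + \frac{1}{\pi}\left(\sin t - \frac{\sin 3t}{3^2} + \frac{\sin 5t}{5^2} - \frac{\sin 7t}{7^2} + \cdots\right) $$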

A major use of the MATLAB Symbolic Math Toolbox and MAPLE, when dealing with
Fourier series, is to avoid the tedious and frequently error-prone integration involved in
determining the coefficients $a_n$ and $b_n$. It is therefore advisable to use them to check
the accuracy of integration. To illustrate we shall consider Examples 7.2 and 7.3.

In MAPLE n may be declared to be an integer using the command


assume(n,integer);
which helps with simplification of answers. There is no comparable command in
MATLAB so, when using the Symbolic Math Toolbox, we shall use the command
maple('assume (n,integer)')
Considering Example 7.2 the MATLAB commands


syms t n
maple('assume (n,integer)');
int((t^2 + t)*cos(n*t),-pi,pi)/pi
return the value of $a_n$ as

    4(-1)^n~/n~^2

Entering the command pretty(ans) gives $a_n$ in the form $4(-1)^{n\sim}/n{\sim}^2$, where n~ indicates
that n is an integer. Likewise the commands

int((t^2 + t)*sin(n*t),-pi,pi)/pi;
pretty(ans)


return $b_n$ as

    $$ -\frac{2(-1)^{n\sim}}{n\sim} $$

thus checking with the values given in the solution.
The corresponding commands in MAPLE are


assume(n,integer);
int((t^2 + t)*cos(n*t),t = -Pi..Pi)/Pi;
returning the value of $a_n$ as

    $$ \frac{4(-1)^{n\sim}}{n{\sim}^2} $$

with the further command

int((t^2 + t)*sin(n*t),t = -Pi..Pi)/Pi;

returning the value of $b_n$ as

    $$ -\frac{2(-1)^{n\sim}}{n\sim} $$

again checking with the values given in the solution.
In Example 7.3 we are dealing with a piecewise function, which can be specified 
using the piecewise command. In MATLAB the commands 


syms t n 
maple(‘assume (n,integer)"'); 
т = {еш СЕНШ UB Tio IE? rry То S sr [e S 








(iecore = @|„ [әле з: „эшле |] у 
пае ео оО оо 
pretty (ans 





return the value of a, as 
Sors ME oT E NEM EN EST ae 
n pi 


with the further commands 


e? 





еа АЛЕ M CO PRO Sr TS 
pretty (ans) 
returning the value of $b_n$ as

    $$ \frac{\sin(1/2\ \pi\, n{\sim})}{n{\sim}^2\,\pi} $$
In MAPLE the commands 
Есау (очесет ее (ЕВРА ЕЕЕ Рата 
t<= Pi), Pi/2,ts= Pr, Pil—t/ 2); 
Баео 
assume (n, integer); 
а ааа eos (inet), ИО) Иа 
Tox E (t Morena (rng C MEC (o TO ZR 
return the same values as MATLAB above for a, and b,. 
An alternative approach to using the piecewise command is to express the 
function in terms of Heaviside functions.


7.2.4 Even and odd functions 


Noting that a particular function possesses certain symmetrical properties enables us 
both to tell which terms are absent from a Fourier series expansion of the function and 
to simplify the expressions determining the remaining coefficients. In this section we 
consider even and odd function symmetries. 
First we review the properties of even and odd functions that are useful for determining
the Fourier coefficients. If f(t) is an even function then f(t) = f(−t) for all t, and
the graph of the function is symmetrical about the vertical axis as illustrated in Figure
7.6(a). From the definition of integration, it follows that if f(t) is an even function then

    $$ \int_{-a}^{a} f(t)\,dt = 2\int_0^a f(t)\,dt $$

If f(t) is an odd function then f(t) = −f(−t) for all t, and the graph of the function is
symmetrical about the origin; that is, there is opposite-quadrant symmetry, as illustrated
in Figure 7.6(b). It follows that if f(t) is an odd function then

    $$ \int_{-a}^{a} f(t)\,dt = 0 $$

Figure 7.6 Graphs of (a) an even function and (b) an odd function.

The following properties of even and odd functions are also useful for our purposes:


(a) the sum of two (or more) odd functions is an odd function; 

(b) the product of two even functions is an even function; 

(c) the product of two odd functions is an even function; 

(d) the product of an odd and an even function is an odd function; 
(e) the derivative of an even function is an odd function; 

(f) the derivative of an odd function is an even function.


(Noting that $t^{\text{even}}$ is even and $t^{\text{odd}}$ is odd helps one to remember (a)–(f).)


Using these properties, and taking $d = -\tfrac{1}{2}T$ in (7.11) and (7.12), we have the following:


(i) If f(t) is an even periodic function of period T then

    $$ a_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\cos n\omega t\,dt = \frac{4}{T}\int_0^{T/2} f(t)\cos n\omega t\,dt $$

using property (b), and

    $$ b_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\sin n\omega t\,dt = 0 $$

using property (d).


Thus the Fourier series expansion of an even periodic function f(t) with period
T consists of cosine terms only and, from (7.3), is given by

    $$ f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n\cos n\omega t $$    (7.17)

with

    $$ a_n = \frac{4}{T}\int_0^{T/2} f(t)\cos n\omega t\,dt \quad (n = 0, 1, 2, \ldots) $$    (7.18)
(ii) If f(t) is an odd periodic function of period T then

    $$ a_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\cos n\omega t\,dt = 0 $$

using property (d), and

    $$ b_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\sin n\omega t\,dt = \frac{4}{T}\int_0^{T/2} f(t)\sin n\omega t\,dt $$

using property (c).


Thus the Fourier series expansion of an odd periodic function f(t) with period T
consists of sine terms only and, from (7.3), is given by

    $$ f(t) = \sum_{n=1}^{\infty} b_n\sin n\omega t $$    (7.19)

with

    $$ b_n = \frac{4}{T}\int_0^{T/2} f(t)\sin n\omega t\,dt \quad (n = 1, 2, 3, \ldots) $$    (7.20)


Example 7.4 A periodic function f(t) with period 2π is defined within the period −π < t < π by

    $$ f(t) = \begin{cases} -1 & (-\pi < t < 0) \\ 1 & (0 < t < \pi) \end{cases} $$

Find its Fourier series expansion.

Figure 7.7 Square wave of Example 7.4.




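Solution A sketch of the function f(t) over the interval $-4\pi < t < 4\pi$ is shown in Figure 7.7.
Clearly f(t) is an odd function of t, so that its Fourier series expansion consists of sine
terms only. Taking $T = 2\pi$, that is $\omega = 1$, in (7.19) and (7.20), the Fourier series expansion is given by

$$f(t) = \sum_{n=1}^{\infty} b_n\sin nt$$

with

$$b_n = \frac{2}{\pi}\int_{0}^{\pi} f(t)\sin nt\,dt \quad (n = 1, 2, 3, \ldots)$$

$$= \frac{2}{\pi}\int_{0}^{\pi} 1\cdot\sin nt\,dt = \frac{2}{\pi}\left[-\frac{1}{n}\cos nt\right]_{0}^{\pi} = \frac{2}{n\pi}(1 - \cos n\pi) = \frac{2}{n\pi}\left[1 - (-1)^n\right]$$

$$= \begin{cases} 4/n\pi & (\text{odd } n) \\ 0 & (\text{even } n) \end{cases}$$

Thus the Fourier series expansion of f(t) is

$$f(t) = \frac{4}{\pi}\left(\sin t + \tfrac{1}{3}\sin 3t + \tfrac{1}{5}\sin 5t + \ldots\right) = \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{\sin(2n-1)t}{2n-1} \tag{7.21}$$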
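
As an informal check on the coefficients of Example 7.4, the integral defining $b_n$ can be approximated numerically. The sketch below is illustrative only: the helper names `fourier_b` and `square_wave` and the choice of a midpoint rule with 20000 steps are assumptions made for this illustration, not part of the text.

```python
import math

def fourier_b(f, n, T, steps=20000):
    """Midpoint-rule estimate of b_n = (2/T) * integral of f(t) sin(n*omega*t)
    over the period -T/2 <= t <= T/2, where omega = 2*pi/T."""
    omega = 2 * math.pi / T
    h = T / steps
    total = 0.0
    for k in range(steps):
        t = -T / 2 + (k + 0.5) * h  # midpoint of the k-th subinterval
        total += f(t) * math.sin(n * omega * t)
    return 2.0 / T * total * h

def square_wave(t):
    """One period of the square wave of Example 7.4 on -pi < t < pi."""
    return 1.0 if t > 0 else -1.0

for n in range(1, 7):
    exact = 4 / (n * math.pi) if n % 2 else 0.0  # 4/(n*pi) for odd n, 0 for even n
    print(n, round(fourier_b(square_wave, n, 2 * math.pi), 5), round(exact, 5))
```

The two printed columns should agree closely, with the even-numbered coefficients vanishing as the odd symmetry of f(t) requires.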


Example 7.5 A periodic function f(t) with period $2\pi$ is defined as

$$f(t) = t^2 \quad (-\pi < t < \pi), \qquad f(t) = f(t + 2\pi)$$

Obtain a Fourier series expansion for it.


Solution A sketch of the function f(t) over the interval $-3\pi < t < 3\pi$ is shown in Figure 7.8.
Clearly, f(t) is an even function of t, so that its Fourier series expansion consists of
cosine terms only. Taking $T = 2\pi$, that is $\omega = 1$, in (7.17) and (7.18), the Fourier series
expansion is given by

$$f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n\cos nt$$

with

$$a_0 = \frac{2}{\pi}\int_{0}^{\pi} f(t)\,dt = \frac{2}{\pi}\int_{0}^{\pi} t^2\,dt = \tfrac{2}{3}\pi^2$$

Figure 7.8 The function f(t) of Example 7.5.

and

$$a_n = \frac{2}{\pi}\int_{0}^{\pi} f(t)\cos nt\,dt \quad (n = 1, 2, 3, \ldots)$$

$$= \frac{2}{\pi}\int_{0}^{\pi} t^2\cos nt\,dt = \frac{2}{\pi}\left[\frac{t^2}{n}\sin nt + \frac{2t}{n^2}\cos nt - \frac{2}{n^3}\sin nt\right]_{0}^{\pi}$$

$$= \frac{2}{\pi}\left(\frac{2\pi}{n^2}\cos n\pi\right) = \frac{4}{n^2}(-1)^n$$

since $\sin n\pi = 0$ and $\cos n\pi = (-1)^n$. Thus the Fourier series expansion of $f(t) = t^2$ is

$$f(t) = \tfrac{1}{3}\pi^2 + 4\sum_{n=1}^{\infty}\frac{(-1)^n}{n^2}\cos nt \tag{7.22}$$

or, writing out the first few terms,

$$f(t) = \tfrac{1}{3}\pi^2 - 4\cos t + \cos 2t - \tfrac{4}{9}\cos 3t + \ldots$$
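
A quick numerical experiment can make the result of Example 7.5 more tangible. The following sketch, in which the function name `t_squared_series` and the sample points are assumptions chosen for illustration, evaluates a partial sum of the series (7.22) and compares it with $t^2$ at a few points of the period.

```python
import math

def t_squared_series(t, terms):
    """Partial sum of (7.22): pi^2/3 + 4 * sum_{n=1}^{terms} (-1)^n cos(n t) / n^2."""
    total = math.pi ** 2 / 3
    for n in range(1, terms + 1):
        total += 4 * (-1) ** n * math.cos(n * t) / n ** 2
    return total

# Compare the partial sum with t^2 at several points in -pi <= t <= pi.
for t in (0.0, 1.0, 2.5, math.pi):
    print(t, t_squared_series(t, 200), t * t)
```

Because the coefficients decay like $1/n^2$, a few hundred terms already reproduce $t^2$ to about two decimal places across the period.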


7.2.5 Linearity property 


The linearity property as applied to Fourier series may be stated in the form of the 
following theorem. 


Theorem 7.1 If $f(t) = l\,g(t) + m\,h(t)$, where g(t) and h(t) are periodic functions of period T and l and m
are arbitrary constants, then f(t) has a Fourier series expansion in which the coefficients
are the sums of the coefficients in the Fourier series expansions of g(t) and h(t) multiplied by l and m respectively.


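Proof Clearly f(t) is periodic with period T. If the Fourier series expansions of g(t) and h(t) are

$$g(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n\cos n\omega t + \sum_{n=1}^{\infty} b_n\sin n\omega t$$

$$h(t) = \tfrac{1}{2}\alpha_0 + \sum_{n=1}^{\infty} \alpha_n\cos n\omega t + \sum_{n=1}^{\infty} \beta_n\sin n\omega t$$

then, using (7.4) and (7.5), the Fourier coefficients in the expansion of f(t) are

$$A_n = \frac{2}{T}\int_{d}^{d+T} f(t)\cos n\omega t\,dt = \frac{2}{T}\int_{d}^{d+T}\left[l\,g(t) + m\,h(t)\right]\cos n\omega t\,dt$$

$$= \frac{2l}{T}\int_{d}^{d+T} g(t)\cos n\omega t\,dt + \frac{2m}{T}\int_{d}^{d+T} h(t)\cos n\omega t\,dt = l a_n + m\alpha_n$$

and

$$B_n = \frac{2}{T}\int_{d}^{d+T} f(t)\sin n\omega t\,dt = \frac{2l}{T}\int_{d}^{d+T} g(t)\sin n\omega t\,dt + \frac{2m}{T}\int_{d}^{d+T} h(t)\sin n\omega t\,dt = l b_n + m\beta_n$$

confirming that the Fourier series expansion of f(t) is

$$f(t) = \tfrac{1}{2}(l a_0 + m\alpha_0) + \sum_{n=1}^{\infty}(l a_n + m\alpha_n)\cos n\omega t + \sum_{n=1}^{\infty}(l b_n + m\beta_n)\sin n\omega t$$

end of theorem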


Example 7.6 Suppose that g(t) and h(t) are periodic functions of period $2\pi$ and are defined within the
period $-\pi < t < \pi$ by

$$g(t) = t^2, \qquad h(t) = t$$

Determine the Fourier series expansions of both g(t) and h(t) and use the linearity
property to confirm the expansion obtained in Example 7.2 for the periodic function f(t)
defined within the period $-\pi < t < \pi$ by $f(t) = t^2 + t$.


Solution The Fourier series of g(t) is given by (7.22) as

$$g(t) = \tfrac{1}{3}\pi^2 + 4\sum_{n=1}^{\infty}\frac{(-1)^n}{n^2}\cos nt$$

Recognizing that $h(t) = t$ is an odd function of t, we find, taking $T = 2\pi$ and $\omega = 1$ in
(7.19) and (7.20), that its Fourier series expansion is

$$h(t) = \sum_{n=1}^{\infty} b_n\sin nt$$

where

$$b_n = \frac{2}{\pi}\int_{0}^{\pi} h(t)\sin nt\,dt \quad (n = 1, 2, 3, \ldots)$$

$$= \frac{2}{\pi}\int_{0}^{\pi} t\sin nt\,dt = \frac{2}{\pi}\left[-\frac{t}{n}\cos nt + \frac{\sin nt}{n^2}\right]_{0}^{\pi} = -\frac{2}{n}(-1)^n$$

recognizing again that $\cos n\pi = (-1)^n$ and $\sin n\pi = 0$. Thus the Fourier series expansion
of $h(t) = t$ is

$$h(t) = -2\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin nt \tag{7.23}$$

Using the linearity property, we find, by combining (7.22) and (7.23), that the Fourier
series expansion of $f(t) = g(t) + h(t) = t^2 + t$ is

$$f(t) = \tfrac{1}{3}\pi^2 + 4\sum_{n=1}^{\infty}\frac{(-1)^n}{n^2}\cos nt - 2\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin nt$$

which conforms to the series obtained in Example 7.2.


7.2.6 Exercises 


Check evaluation of the integrals using MATLAB or MAPLE whenever possible. 


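1 In each of the following a periodic function f(t) of period $2\pi$ is specified over one period. In each case
sketch a graph of the function for $-4\pi \le t \le 4\pi$ and obtain a Fourier series representation of the function.

(a) $f(t) = \begin{cases} -\pi & (-\pi < t < 0) \\ t & (0 < t < \pi) \end{cases}$

(b) $f(t) = \begin{cases} t + \pi & (-\pi < t < 0) \\ 0 & (0 < t < \pi) \end{cases}$

(c) $f(t) = 1 - \dfrac{t}{\pi} \quad (0 \le t \le 2\pi)$

(d) $f(t) = \begin{cases} 0 & (-\pi \le t \le -\tfrac{1}{2}\pi) \\ 2\cos t & (-\tfrac{1}{2}\pi \le t \le \tfrac{1}{2}\pi) \\ 0 & (\tfrac{1}{2}\pi \le t \le \pi) \end{cases}$

(e) $f(t) = \cos\tfrac{1}{2}t \quad (-\pi < t < \pi)$

(f) $f(t) = |t| \quad (-\pi < t < \pi)$

(g) $f(t) = \begin{cases} 0 & (-\pi \le t \le 0) \\ 2t - \pi & (0 < t \le \pi) \end{cases}$

(h) $f(t) = \begin{cases} -t + e^{t} & (-\pi \le t < 0) \\ t + e^{t} & (0 \le t < \pi) \end{cases}$

2 Obtain the Fourier series expansion of the periodic function f(t) of period $2\pi$ defined over the
period $0 \le t \le 2\pi$ by

$$f(t) = (\pi - t)^2 \quad (0 \le t \le 2\pi)$$

Use the Fourier series to show that

$$\tfrac{1}{12}\pi^2 = \sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n^2}$$

3 The charge q(t) on the plates of a capacitor at time t is as shown in Figure 7.9. Express q(t) as a Fourier
series expansion.

Figure 7.9 Plot of the charge q(t) in Exercise 3.

4 The clipped response of a half-wave rectifier is the periodic function f(t) of period $2\pi$ defined over
the period $0 \le t \le 2\pi$ by

$$f(t) = \begin{cases} 5\sin t & (0 \le t \le \pi) \\ 0 & (\pi \le t \le 2\pi) \end{cases}$$

Express f(t) as a Fourier series expansion.

5 Show that the Fourier series representing the periodic function f(t), where

$$f(t) = \begin{cases} \pi^2 & (-\pi < t < 0) \\ (t - \pi)^2 & (0 < t < \pi) \end{cases}, \qquad f(t + 2\pi) = f(t)$$

is

$$f(t) = \tfrac{2}{3}\pi^2 + \sum_{n=1}^{\infty}\left[\frac{2}{n^2}\cos nt + \frac{(-1)^n}{n}\pi\sin nt\right] - \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{\sin(2n-1)t}{(2n-1)^3}$$

Use this result to show that

(a) $\displaystyle\sum_{n=1}^{\infty}\frac{1}{n^2} = \tfrac{1}{6}\pi^2$

(b) $\displaystyle\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n^2} = \tfrac{1}{12}\pi^2$

6 A periodic function f(t) of period $2\pi$ is defined within the domain $0 \le t \le \pi$ by

$$f(t) = \begin{cases} t & (0 \le t \le \tfrac{1}{2}\pi) \\ \pi - t & (\tfrac{1}{2}\pi \le t \le \pi) \end{cases}$$

Sketch a graph of f(t) for $-2\pi < t < 4\pi$ for the two cases where

(a) f(t) is an even function

(b) f(t) is an odd function

Find the Fourier series expansion that represents the even function for all values of t, and use it to show that

$$\tfrac{1}{8}\pi^2 = \sum_{n=1}^{\infty}\frac{1}{(2n-1)^2}$$

7 A periodic function f(t) of period $2\pi$ is defined within the period $0 \le t \le 2\pi$ by

$$f(t) = \begin{cases} 2 - t/\pi & (0 \le t \le \pi) \\ t/\pi & (\pi \le t \le 2\pi) \end{cases}$$

Draw a graph of the function for $-4\pi \le t \le 4\pi$ and obtain its Fourier series expansion.
By replacing t by $t - \tfrac{1}{2}\pi$ in your answer, show that the periodic function $f(t - \tfrac{1}{2}\pi) - \tfrac{3}{2}$ is
represented by a sine series of odd harmonics.
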

7.2.7 Functions of period T 


Although all the results have been related to periodic functions having period T, all the
examples we have considered so far have involved periodic functions of period $2\pi$. This
was done primarily for ease of manipulation in determining the Fourier coefficients
while becoming acquainted with Fourier series. As mentioned in Section 7.2.3, functions
having unit frequency (that is, of period $2\pi$) are rarely encountered in practice, and in
this section we consider examples of periodic functions having periods other than $2\pi$.


Example 7.7 A periodic function f(t) of period 4 (that is, $f(t + 4) = f(t)$) is defined in the range
$-2 < t < 2$ by

$$f(t) = \begin{cases} 0 & (-2 < t < 0) \\ 1 & (0 < t < 2) \end{cases}$$

Sketch a graph of f(t) for $-6 \le t \le 6$ and obtain a Fourier series expansion for the
function.


Figure 7.10 
The function f(t) 
of Example 7.7. 





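Solution A graph of f(t) for $-6 \le t \le 6$ is shown in Figure 7.10. Taking $T = 4$ in (7.4) and (7.5),
we have

$$a_0 = \tfrac{1}{2}\int_{-2}^{2} f(t)\,dt = \tfrac{1}{2}\left(\int_{-2}^{0} 0\,dt + \int_{0}^{2} 1\,dt\right) = 1$$

$$a_n = \tfrac{1}{2}\int_{-2}^{2} f(t)\cos\tfrac{1}{2}n\pi t\,dt \quad (n = 1, 2, 3, \ldots)$$

$$= \tfrac{1}{2}\left(\int_{-2}^{0} 0\,dt + \int_{0}^{2}\cos\tfrac{1}{2}n\pi t\,dt\right) = 0$$

and

$$b_n = \tfrac{1}{2}\int_{-2}^{2} f(t)\sin\tfrac{1}{2}n\pi t\,dt \quad (n = 1, 2, 3, \ldots)$$

$$= \tfrac{1}{2}\left(\int_{-2}^{0} 0\,dt + \int_{0}^{2}\sin\tfrac{1}{2}n\pi t\,dt\right) = \frac{1}{n\pi}(1 - \cos n\pi) = \frac{1}{n\pi}\left[1 - (-1)^n\right]$$

$$= \begin{cases} 0 & (\text{even } n) \\ 2/n\pi & (\text{odd } n) \end{cases}$$

Thus, from (7.3), the Fourier series expansion of f(t) is

$$f(t) = \tfrac{1}{2} + \frac{2}{\pi}\left(\sin\tfrac{1}{2}\pi t + \tfrac{1}{3}\sin\tfrac{3}{2}\pi t + \tfrac{1}{5}\sin\tfrac{5}{2}\pi t + \ldots\right)$$

$$= \tfrac{1}{2} + \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{1}{2n-1}\sin\tfrac{1}{2}(2n-1)\pi t$$
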

Example 7.8 A periodic function f(t) of period 2 is defined by

$$f(t) = \begin{cases} 3t & (0 < t < 1) \\ 3 & (1 < t < 2) \end{cases}, \qquad f(t + 2) = f(t)$$

Sketch a graph of f(t) for $-4 \le t \le 4$ and determine a Fourier series expansion for the
function.


Figure 7.11 The function f(t) of Example 7.8.

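Solution A graph of f(t) for $-4 \le t \le 4$ is shown in Figure 7.11. Taking $T = 2$ in (7.4) and (7.5),
we have

$$a_0 = \int_{0}^{2} f(t)\,dt = \int_{0}^{1} 3t\,dt + \int_{1}^{2} 3\,dt = \tfrac{9}{2}$$

$$a_n = \int_{0}^{2} f(t)\cos n\pi t\,dt \quad (n = 1, 2, 3, \ldots)$$

$$= \int_{0}^{1} 3t\cos n\pi t\,dt + \int_{1}^{2} 3\cos n\pi t\,dt = \left[\frac{3t\sin n\pi t}{n\pi} + \frac{3\cos n\pi t}{(n\pi)^2}\right]_{0}^{1} + \left[\frac{3\sin n\pi t}{n\pi}\right]_{1}^{2}$$

$$= \frac{3}{(n\pi)^2}(\cos n\pi - 1) = \begin{cases} 0 & (\text{even } n) \\ -6/(n\pi)^2 & (\text{odd } n) \end{cases}$$

and

$$b_n = \int_{0}^{2} f(t)\sin n\pi t\,dt \quad (n = 1, 2, 3, \ldots)$$

$$= \int_{0}^{1} 3t\sin n\pi t\,dt + \int_{1}^{2} 3\sin n\pi t\,dt = \left[-\frac{3t\cos n\pi t}{n\pi} + \frac{3\sin n\pi t}{(n\pi)^2}\right]_{0}^{1} + \left[-\frac{3\cos n\pi t}{n\pi}\right]_{1}^{2}$$

$$= -\frac{3}{n\pi}\cos 2n\pi = -\frac{3}{n\pi}$$

Thus, from (7.3), the Fourier series expansion of f(t) is

$$f(t) = \tfrac{9}{4} - \frac{6}{\pi^2}\left(\cos\pi t + \tfrac{1}{9}\cos 3\pi t + \tfrac{1}{25}\cos 5\pi t + \ldots\right) - \frac{3}{\pi}\left(\sin\pi t + \tfrac{1}{2}\sin 2\pi t + \tfrac{1}{3}\sin 3\pi t + \ldots\right)$$

$$= \tfrac{9}{4} - \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{\cos(2n-1)\pi t}{(2n-1)^2} - \frac{3}{\pi}\sum_{n=1}^{\infty}\frac{\sin n\pi t}{n}$$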
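
For periods other than $2\pi$ the hand integration is more error-prone, so a numerical cross-check can be useful. The sketch below assumes a simple midpoint rule over $0 \le t \le T$; the helper names `coefficients` and `f_example_7_8` are hypothetical and chosen only for this illustration of (7.4) and (7.5) applied to Example 7.8.

```python
import math

def coefficients(f, n, T, steps=20000):
    """Midpoint-rule estimates of a_n and b_n from (7.4) and (7.5),
    integrating over the single period 0 <= t <= T with omega = 2*pi/T."""
    omega = 2 * math.pi / T
    h = T / steps
    a = b = 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoint of the k-th subinterval
        a += f(t) * math.cos(n * omega * t)
        b += f(t) * math.sin(n * omega * t)
    return 2 * a * h / T, 2 * b * h / T

def f_example_7_8(t):
    """One period (0 < t < 2) of the function of Example 7.8."""
    return 3 * t if t < 1 else 3.0

for n in range(0, 5):
    a_n, b_n = coefficients(f_example_7_8, n, 2.0)
    print(n, round(a_n, 5), round(b_n, 5))
```

For n = 0 the first value is $a_0$ itself (so the constant term of the series is half of it), and for $n \ge 1$ the printed values can be compared with $-6/(n\pi)^2$ for odd n, zero for even n, and $-3/(n\pi)$.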





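Example 7.9 Obtain the Fourier series expansion of the rectified sine wave

$$f(t) = |\sin t|$$

Solution A sketch of the wave over the interval $-\pi < t < 2\pi$ is shown in Figure 7.12. Clearly,
f(t) is periodic with period $\pi$. Taking $T = \pi$, that is, $\omega = 2$, in (7.3)–(7.5), and noting that f(t) is an even function so that no sine terms arise, the Fourier
series expansion is given by

$$f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n\cos 2nt$$

where

$$a_0 = \frac{2}{\pi}\int_{0}^{\pi}\sin t\,dt = \frac{4}{\pi}$$

Figure 7.12 Rectified wave $f(t) = |\sin t|$.

and

$$a_n = \frac{2}{\pi}\int_{0}^{\pi}\sin t\cos 2nt\,dt = \frac{1}{\pi}\int_{0}^{\pi}\left[\sin(2n+1)t - \sin(2n-1)t\right]dt$$

$$= \frac{1}{\pi}\left[-\frac{\cos(2n+1)t}{2n+1} + \frac{\cos(2n-1)t}{2n-1}\right]_{0}^{\pi}$$

$$= \frac{1}{\pi}\left[\left(\frac{1}{2n+1} - \frac{1}{2n-1}\right) - \left(-\frac{1}{2n+1} + \frac{1}{2n-1}\right)\right] = -\frac{4}{\pi}\,\frac{1}{4n^2 - 1}$$

Thus the Fourier series expansion of f(t) is

$$f(t) = \frac{2}{\pi} - \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{1}{4n^2 - 1}\cos 2nt$$

or, writing out the first few terms,

$$f(t) = \frac{2}{\pi} - \frac{4}{\pi}\left(\tfrac{1}{3}\cos 2t + \tfrac{1}{15}\cos 4t + \tfrac{1}{35}\cos 6t + \ldots\right)$$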


7.2.8 Exercises


8 Find a Fourier series expansion of the periodic function

$$f(t) = t \quad (-l < t < l), \qquad f(t + 2l) = f(t)$$

9 A periodic function f(t) of period 2l is defined over one period by

$$f(t) = \begin{cases} \dfrac{K}{l}(l + t) & (-l < t < 0) \\[2mm] \dfrac{K}{l}(l - t) & (0 < t < l) \end{cases}$$

Determine its Fourier series expansion and illustrate graphically for $-3l < t < 3l$.

10 A periodic function of period 10 is defined within the period $-5 < t < 5$ by

$$f(t) = \begin{cases} 0 & (-5 < t < 0) \\ 3 & (0 < t < 5) \end{cases}$$

Determine its Fourier series expansion and illustrate graphically for $-12 < t < 12$.

11 Passing a sinusoidal voltage $A\sin\omega t$ through a half-wave rectifier produces the clipped sine wave
shown in Figure 7.13. Determine a Fourier series expansion of the rectified wave.

Figure 7.13 Rectified sine wave of Exercise 11.

12 Obtain a Fourier series expansion of the periodic function

$$f(t) = t^2 \quad (-T < t < T), \qquad f(t + 2T) = f(t)$$

and illustrate graphically for $-3T < t < 3T$.

13 Determine a Fourier series representation of the periodic voltage e(t) shown in Figure 7.14.

Figure 7.14 Voltage e(t) of Exercise 13.
7.2.9 Convergence of the Fourier series


So far we have concentrated our attention on determining the Fourier series expansion
corresponding to a given periodic function f(t). In reality, this is an exercise in
integration, since we merely have to compute the coefficients $a_n$ and $b_n$ using Euler’s
formulae (7.4) and (7.5) and then substitute these values into (7.3). We have not yet
considered the question of whether or not the Fourier series thus obtained is a valid
representation of the periodic function f(t). It should not be assumed that the existence
of the coefficients $a_n$ and $b_n$ in itself implies that the associated series converges to the
function f(t).

A full discussion of the convergence of a Fourier series is beyond the scope of
this book and we shall confine ourselves to simply stating a set of conditions which
ensures that f(t) has a convergent Fourier series expansion. These conditions, known as
Dirichlet's conditions, may be stated in the form of Theorem 7.2.


Theorem 7.2 Dirichlet's conditions

If f(t) is a bounded periodic function that in any period has
(a) a finite number of isolated maxima and minima, and
(b) a finite number of points of finite discontinuity

then the Fourier series expansion of f(t) converges to f(t) at all points where f(t) is
continuous and to the average of the right- and left-hand limits of f(t) at points where
f(t) is discontinuous (that is, to the mean of the discontinuity).

end of theorem


Example 7.10 Give reasons why the functions

(a) $\dfrac{1}{3 - t}$    (b) $\sin\left(\dfrac{1}{t - 2}\right)$

do not satisfy Dirichlet's conditions in the interval $0 \le t \le 2\pi$.


Solution (a) The function $f(t) = 1/(3 - t)$ has an infinite discontinuity at t = 3, which is within
the interval, and therefore does not satisfy the condition that f(t) must only have
finite discontinuities within a period (that is, it is bounded).

(b) The function $f(t) = \sin[1/(t - 2)]$ has an infinite number of maxima and minima
in the neighbourhood of t = 2, which is within the interval, and therefore does not
satisfy the requirement that f(t) must have only a finite number of isolated
maxima and minima within one period.


The conditions of Theorem 7.2 are sufficient to ensure that a representative Fourier
series expansion of f(t) exists. However, they are not necessary conditions for convergence,
and it does not follow that a representative Fourier series does not exist if they are not
satisfied. Indeed, necessary conditions on f(t) for the existence of a convergent Fourier
series are not yet known. In practice, this does not cause any problems, since for almost
all conceivable practical applications the functions that are encountered satisfy the
conditions of Theorem 7.2 and therefore have representative Fourier series.

Another issue of importance in practical applications is the rate of convergence of
a Fourier series, since this is an indication of how many terms must be taken in the
expansion in order to obtain a realistic approximation to the function f(t) it represents.
Obviously, this is determined by the coefficients $a_n$ and $b_n$ of the Fourier series and the
manner in which these decrease as n increases.

In an example, such as Example 7.1, in which the function f(t) is only piecewise-continuous,
exhibiting jump discontinuities, the Fourier coefficients decrease as 1/n,
and it may be necessary to include a large number of terms to obtain an adequate
approximation to f(t). In an example, such as Example 7.3, in which the function
is a continuous function but has discontinuous first derivatives (owing to the sharp
corners), the Fourier coefficients decrease as $1/n^2$, and so one would expect the series
to converge more rapidly. Indeed, this argument applies in general, and we may
summarize as follows:


(a) if f(t) is only piecewise-continuous then the coefficients in its Fourier series
representation decrease as 1/n;


(b) if f(t) is continuous everywhere but has discontinuous first derivatives then the
coefficients in its Fourier series representation decrease as $1/n^2$;


(c) if f(t) and all its derivatives up to that of the rth order are continuous but the
(r + 1)th derivative is discontinuous then the coefficients in its Fourier series
representation decrease as $1/n^{r+2}$.


These observations are not surprising, since they simply tell us that the smoother the 
function f(t), the more rapidly will its Fourier series representation converge. 

To illustrate some of these issues related to convergence we return to Example 7.4, 
in which the Fourier series (7.21) was obtained as a representation of the square wave 
of Figure 7.7. 

Since (7.21) is an infinite series, it is clearly not possible to plot a graph of the result.
However, by considering finite partial sums, it is possible to plot graphs of approximations
to the series. Denoting the sum of the first N terms in the infinite series by $f_N(t)$,
that is

$$f_N(t) = \frac{4}{\pi}\sum_{n=1}^{N}\frac{\sin(2n-1)t}{2n-1} \tag{7.24}$$

the graphs of $f_N(t)$ for N = 1, 2, 3 and 20 are as shown in Figure 7.15. It can be seen
that at points where f(t) is continuous the approximation of f(t) by $f_N(t)$ improves as
N increases, confirming that the series converges to f(t) at all such points. It can also
be seen that at points of discontinuity of f(t), which occur at $t = \pm n\pi$ (n = 0, 1, 2, ...),
the series converges to the mean value of the discontinuity, which in this particular
example is $\tfrac{1}{2}(-1 + 1) = 0$. As a consequence, the equality sign in (7.21) needs to be
interpreted carefully. Although such use may be acceptable, in the sense that the series
converges to f(t) for values of t where f(t) is continuous, this is not so at points of
discontinuity. To overcome this problem, the symbol ~ (read as ‘behaves as’ or ‘represented
by’) rather than = is frequently used in the Fourier series representation of a
function f(t), so that (7.21) is often written as

$$f(t) \sim \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{\sin(2n-1)t}{2n-1}$$

Figure 7.15 Plots of $f_N(t)$ for a square wave: (a) N = 1; (b) 2; (c) 3; (d) 20.

In Section 7.7.3 it is shown that the Fourier series converges to f(t) in the sense that the
integral of the square of the difference between f(t) and $f_N(t)$ is minimized and tends to
zero as $N \to \infty$.

We note that convergence of the Fourier series is slowest near a point of discontinuity,
such as the one that occurs at t = 0. Although the series does converge to the mean
value of the discontinuity (namely zero) at t = 0, there is, as indicated in Figure 7.15(d),
an undershoot at $t = 0^-$ (that is, just to the left of t = 0) and an overshoot at $t = 0^+$ (that
is, just to the right of t = 0). This non-smooth convergence of the Fourier series leading
to the occurrence of an undershoot and an overshoot at points of discontinuity of f(t) is
a characteristic of all Fourier series representing discontinuous functions, not only that
of the square wave of Example 7.4, and is known as Gibbs’ phenomenon after the
American physicist J. W. Gibbs (1839–1903). The magnitude of the undershoot/overshoot
does not diminish as $N \to \infty$ in (7.24), but simply gets ‘sharper’ and ‘sharper’,
tending to a spike. In general, the magnitude of the undershoot and overshoot together
amount to about 18% of the magnitude of the discontinuity (that is, the difference in the
values of the function f(t) to the left and right of the discontinuity). It is important that
the existence of this phenomenon be recognized, since in certain practical applications
these spikes at discontinuities have to be suppressed by using appropriate smoothing
factors.
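
The persistence of the overshoot can be observed numerically. The sketch below is an illustration under stated assumptions: it evaluates the partial sum (7.24) at $t = \pi/2N$, which is where the first maximum to the right of t = 0 occurs (the derivative of (7.24) is $(2/\pi)\sin 2Nt/\sin t$, which first vanishes there); the function name `f_N` and the chosen values of N are not from the text.

```python
import math

def f_N(t, N):
    """Partial sum (7.24): (4/pi) * sum_{n=1}^{N} sin((2n-1) t) / (2n-1)."""
    return 4 / math.pi * sum(
        math.sin((2 * n - 1) * t) / (2 * n - 1) for n in range(1, N + 1)
    )

for N in (5, 20, 100, 500):
    t_peak = math.pi / (2 * N)  # location of the first maximum right of t = 0
    print(N, round(t_peak, 5), round(f_N(t_peak, N), 5))
```

As N grows the peak moves towards t = 0 but its height stays near 1.18 rather than falling to 1. The excess of about 0.18 on each side of a jump of magnitude 2 is consistent with the combined figure of roughly 18% quoted above.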


To reproduce the plots of Figure 7.15 and see how the series converges as N
increases use the following MATLAB commands:

    t=pi/100*[-300:300];          % time grid covering -3*pi to 3*pi
    f=0;                          % running partial sum
    % corner points of the square wave itself
    T=[-3*pi -2*pi -2*pi -pi -pi 0 0 pi pi 2*pi 2*pi 3*pi];
    y=[-1 -1 1 1 -1 -1 1 1 -1 -1 1 1];
    for n=1:20
        f=f+4/pi*sin((2*n-1)*t)/(2*n-1);   % add the nth term of (7.24)
        plot(T,y,t,f,[-3*pi 3*pi],[0 0],'k-',[0 0],[-1.3 1.3],'k-')
        axis([-3*pi,3*pi,-inf,inf]),pause
    end

The pause command has been included to give you an opportunity to view the
plots at the end of each step. Press any key to proceed.


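Theoretically, we can use the series (7.21) to obtain an approximation to $\pi$. This is
achieved by taking $t = \tfrac{1}{2}\pi$, when f(t) = 1; (7.21) then gives

$$1 = \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{\sin\tfrac{1}{2}(2n-1)\pi}{2n-1}$$

leading to

$$\pi = 4\left(1 - \tfrac{1}{3} + \tfrac{1}{5} - \tfrac{1}{7} + \ldots\right) = 4\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{2n-1}$$

For practical purposes, however, this is not a good way of obtaining an approximation
to $\pi$, because of the slow rate of convergence of the series.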
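
How slow the convergence is can be seen by summing the series directly. In the sketch below the function name `pi_from_series` and the numbers of terms tried are illustrative assumptions.

```python
import math

def pi_from_series(terms):
    """4 * (1 - 1/3 + 1/5 - ...) truncated after the given number of terms."""
    return 4 * sum((-1) ** (n + 1) / (2 * n - 1) for n in range(1, terms + 1))

for terms in (10, 100, 1000, 10000):
    approx = pi_from_series(terms)
    print(terms, approx, abs(approx - math.pi))
```

The error falls only in proportion to the reciprocal of the number of terms, so each additional correct decimal place costs roughly ten times as many terms.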


7.3 Functions defined over a finite interval


One of the requirements of Fourier's theorem is that the function to be expanded be
periodic. Therefore a function f(t) that is not periodic cannot have a Fourier series
representation that converges to it for all values of t. However, we can obtain a Fourier
series expansion that represents a non-periodic function f(t) that is defined only over
a finite time interval $0 \le t \le \tau$. This is a facility that is frequently used to solve
problems in practice, particularly boundary-value problems involving partial differential
equations, such as the consideration of heat flow along a bar or the vibrations
of a string. Various forms of Fourier series representations of f(t), valid only in the
interval $0 \le t \le \tau$, are possible, including series consisting of cosine terms only or
series consisting of sine terms only. To obtain these, various periodic extensions of f(t)
are formulated.


7.3.1 Full-range series


Suppose the given function f(t) is defined only over the finite time interval $0 \le t \le \tau$.
Then, to obtain a full-range Fourier series representation of f(t) (that is, a series
consisting of both cosine and sine terms), we define the periodic extension $\phi(t)$ of
f(t) by

$$\phi(t) = f(t) \quad (0 < t < \tau), \qquad \phi(t + \tau) = \phi(t)$$

The graphs of a possible f(t) and its periodic extension $\phi(t)$ are shown in Figures 7.16(a)
and (b) respectively.

Figure 7.16 Graphs of a function defined only over (a) a finite interval $0 \le t \le \tau$ and (b) its periodic extension.

Provided that f(t) satisfies Dirichlet's conditions in the interval $0 \le t \le \tau$, the
new function $\phi(t)$, of period $\tau$, will have a convergent Fourier series expansion.
Since, within the particular period $0 < t < \tau$, $\phi(t)$ is identical with f(t), it follows
that this Fourier series expansion of $\phi(t)$ will be representative of f(t) within this
interval.


Example 7.11 Find a full-range Fourier series expansion of f(t) = t valid in the finite interval 0 < t < 4.
Draw graphs of both f(t) and the periodic function represented by the Fourier series
obtained.


Solution Define the periodic function $\phi(t)$ by

$$\phi(t) = f(t) = t \quad (0 < t < 4), \qquad \phi(t + 4) = \phi(t)$$

Then the graphs of f(t) and its periodic extension $\phi(t)$ are as shown in Figures 7.17(a)
and (b) respectively. Since $\phi(t)$ is a periodic function with period 4, it has a convergent
Fourier series expansion. Taking T = 4 in (7.4) and (7.5), the Fourier coefficients are
determined as

Figure 7.17 The functions f(t) and $\phi(t)$ of Example 7.11.

$$a_0 = \tfrac{1}{2}\int_{0}^{4} f(t)\,dt = \tfrac{1}{2}\int_{0}^{4} t\,dt = 4$$

$$a_n = \tfrac{1}{2}\int_{0}^{4} f(t)\cos\tfrac{1}{2}n\pi t\,dt \quad (n = 1, 2, 3, \ldots)$$

$$= \tfrac{1}{2}\int_{0}^{4} t\cos\tfrac{1}{2}n\pi t\,dt = \tfrac{1}{2}\left[\frac{2t}{n\pi}\sin\tfrac{1}{2}n\pi t + \frac{4}{(n\pi)^2}\cos\tfrac{1}{2}n\pi t\right]_{0}^{4} = 0$$

and

$$b_n = \tfrac{1}{2}\int_{0}^{4} f(t)\sin\tfrac{1}{2}n\pi t\,dt \quad (n = 1, 2, 3, \ldots)$$

$$= \tfrac{1}{2}\int_{0}^{4} t\sin\tfrac{1}{2}n\pi t\,dt = \tfrac{1}{2}\left[-\frac{2t}{n\pi}\cos\tfrac{1}{2}n\pi t + \frac{4}{(n\pi)^2}\sin\tfrac{1}{2}n\pi t\right]_{0}^{4} = -\frac{4}{n\pi}$$

Thus, by (7.3), the Fourier series expansion of $\phi(t)$ is

$$\phi(t) = 2 - \frac{4}{\pi}\left(\sin\tfrac{1}{2}\pi t + \tfrac{1}{2}\sin\pi t + \tfrac{1}{3}\sin\tfrac{3}{2}\pi t + \tfrac{1}{4}\sin 2\pi t + \tfrac{1}{5}\sin\tfrac{5}{2}\pi t + \ldots\right)$$

$$= 2 - \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\sin\tfrac{1}{2}n\pi t$$

Since $\phi(t) = f(t)$ for 0 < t < 4, it follows that this Fourier series is representative of f(t)
within this interval, so that

$$f(t) = t = 2 - \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\sin\tfrac{1}{2}n\pi t \quad (0 < t < 4) \tag{7.25}$$

It is important to appreciate that this series converges to t only within the interval
0 < t < 4. For values of t outside this interval it converges to the periodic extended
function $\phi(t)$. Again convergence is to be interpreted in the sense of Theorem 7.2, so
that at the end points t = 0 and t = 4 the series does not converge to t but to the mean
of the discontinuity in $\phi(t)$, namely the value 2.


7.3.2 Half-range cosine and sine series


Rather than develop the periodic extension $\phi(t)$ of f(t) as in Section 7.3.1, it is possible
to formulate periodic extensions that are either even or odd functions, so that the resulting
Fourier series of the extended periodic functions consist either of cosine terms only
or sine terms only.

For a function f(t) defined only over the finite interval $0 \le t \le \tau$ its even periodic
extension F(t) is the even periodic function defined by

$$F(t) = \begin{cases} f(t) & (0 < t < \tau) \\ f(-t) & (-\tau < t < 0) \end{cases}, \qquad F(t + 2\tau) = F(t)$$

As an illustration, the even periodic extension F(t) of the function f(t) shown in
Figure 7.16(a) (redrawn in Figure 7.18(a)) is shown in Figure 7.18(b).

Figure 7.18 (a) A function f(t); (b) its even periodic extension F(t).

Provided that f(t) satisfies Dirichlet’s conditions in the interval $0 < t < \tau$, then, since F(t) is
an even function of period $2\tau$, it follows from Section 7.2.4 that the even periodic
extension F(t) will have a convergent Fourier series representation consisting of cosine
terms only and given by

$$F(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n\cos\frac{n\pi t}{\tau} \tag{7.26}$$

where

$$a_n = \frac{2}{\tau}\int_{0}^{\tau} f(t)\cos\frac{n\pi t}{\tau}\,dt \quad (n = 0, 1, 2, \ldots) \tag{7.27}$$

Since, within the particular interval $0 < t < \tau$, F(t) is identical with f(t), it follows that
the series (7.26) also converges to f(t) within this interval.

For a function f(t) defined only over the finite interval $0 \le t \le \tau$, its odd periodic
extension G(t) is the odd periodic function defined by

$$G(t) = \begin{cases} f(t) & (0 < t < \tau) \\ -f(-t) & (-\tau < t < 0) \end{cases}, \qquad G(t + 2\tau) = G(t)$$

Again, as an illustration, the odd periodic extension G(t) of the function f(t) shown in
Figure 7.16(a) (redrawn in Figure 7.19(a)) is shown in Figure 7.19(b).

Figure 7.19 (a) A function f(t); (b) its odd periodic extension G(t).

Provided that f(t) satisfies Dirichlet's conditions in the interval $0 < t < \tau$, then, since G(t) is
an odd function of period $2\tau$, it follows from Section 7.2.4 that the odd periodic extension
G(t) will have a convergent Fourier series representation consisting of sine terms
only and given by

$$G(t) = \sum_{n=1}^{\infty} b_n\sin\frac{n\pi t}{\tau} \tag{7.28}$$

where

$$b_n = \frac{2}{\tau}\int_{0}^{\tau} f(t)\sin\frac{n\pi t}{\tau}\,dt \quad (n = 1, 2, 3, \ldots) \tag{7.29}$$

Again, since, within the particular interval $0 < t < \tau$, G(t) is identical with f(t), it
follows that the series (7.28) also converges to f(t) within this interval.

We note that both the even and odd periodic extensions F(t) and G(t) are of period
$2\tau$, which is twice the length of the interval over which f(t) is defined. However, the
resulting Fourier series (7.26) and (7.28) are based only on the function f(t), and for this
reason are called the half-range Fourier series expansions of f(t). In particular, the
even half-range expansion F(t), (7.26), is called the half-range cosine series expansion
of f(t), while the odd half-range expansion G(t), (7.28), is called the half-range
sine series expansion of f(t).


Example 7.12 For the function f(t) = t defined only in the interval 0 < t < 4, and considered in
Example 7.11, obtain

(a) a half-range cosine series expansion

(b) a half-range sine series expansion.

Draw graphs of f(t) and of the periodic functions represented by the two series obtained
for $-20 < t < 20$.


Solution (a) Half-range cosine series. Define the periodic function F(t) by

$$F(t) = \begin{cases} f(t) = t & (0 < t < 4) \\ f(-t) = -t & (-4 < t < 0) \end{cases}, \qquad F(t + 8) = F(t)$$

Then, since F(t) is an even periodic function with period 8, it has a convergent
Fourier series expansion given by (7.26). Taking $\tau = 4$ in (7.27), we have

$$a_0 = \tfrac{1}{2}\int_{0}^{4} f(t)\,dt = \tfrac{1}{2}\int_{0}^{4} t\,dt = 4$$

$$a_n = \tfrac{1}{2}\int_{0}^{4} f(t)\cos\tfrac{1}{4}n\pi t\,dt \quad (n = 1, 2, 3, \ldots)$$

$$= \tfrac{1}{2}\int_{0}^{4} t\cos\tfrac{1}{4}n\pi t\,dt = \tfrac{1}{2}\left[\frac{4t}{n\pi}\sin\tfrac{1}{4}n\pi t + \frac{16}{(n\pi)^2}\cos\tfrac{1}{4}n\pi t\right]_{0}^{4}$$

$$= \frac{8}{(n\pi)^2}(\cos n\pi - 1) = \begin{cases} 0 & (\text{even } n) \\ -16/(n\pi)^2 & (\text{odd } n) \end{cases}$$

Then, by (7.26), the Fourier series expansion of F(t) is

$$F(t) = 2 - \frac{16}{\pi^2}\left(\cos\tfrac{1}{4}\pi t + \tfrac{1}{9}\cos\tfrac{3}{4}\pi t + \tfrac{1}{25}\cos\tfrac{5}{4}\pi t + \ldots\right)$$

or

$$F(t) = 2 - \frac{16}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{(2n-1)^2}\cos\tfrac{1}{4}(2n-1)\pi t$$

Since F(t) = f(t) for 0 < t < 4, it follows that this Fourier series is representative
of f(t) within this interval. Thus the half-range cosine series expansion of f(t) is

$$f(t) = t = 2 - \frac{16}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{(2n-1)^2}\cos\tfrac{1}{4}(2n-1)\pi t \quad (0 < t < 4) \tag{7.30}$$

(b) Half-range sine series. Define the periodic function G(t) by

$$G(t) = \begin{cases} f(t) = t & (0 < t < 4) \\ -f(-t) = t & (-4 < t < 0) \end{cases}, \qquad G(t + 8) = G(t)$$

Then, since G(t) is an odd periodic function with period 8, it has a convergent
Fourier series expansion given by (7.28). Taking $\tau = 4$ in (7.29), we have

$$b_n = \tfrac{1}{2}\int_{0}^{4} f(t)\sin\tfrac{1}{4}n\pi t\,dt \quad (n = 1, 2, 3, \ldots)$$

$$= \tfrac{1}{2}\int_{0}^{4} t\sin\tfrac{1}{4}n\pi t\,dt = \tfrac{1}{2}\left[-\frac{4t}{n\pi}\cos\tfrac{1}{4}n\pi t + \frac{16}{(n\pi)^2}\sin\tfrac{1}{4}n\pi t\right]_{0}^{4}$$

$$= -\frac{8}{n\pi}\cos n\pi = -\frac{8}{n\pi}(-1)^n$$

Thus, by (7.28), the Fourier series expansion of G(t) is

$$G(t) = \frac{8}{\pi}\left(\sin\tfrac{1}{4}\pi t - \tfrac{1}{2}\sin\tfrac{1}{2}\pi t + \tfrac{1}{3}\sin\tfrac{3}{4}\pi t - \ldots\right)$$

or

$$G(t) = \frac{8}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n}\sin\tfrac{1}{4}n\pi t$$

Since G(t) = f(t) for 0 < t < 4, it follows that this Fourier series is representative
of f(t) within this interval. Thus the half-range sine series expansion of
f(t) is

$$f(t) = t = \frac{8}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n}\sin\tfrac{1}{4}n\pi t \quad (0 < t < 4) \tag{7.31}$$

Graphs of the given function f(t) and of the even and odd periodic expansions
F(t) and G(t) are given in Figures 7.20(a), (b) and (c) respectively.

Figure 7.20 The functions f(t), F(t) and G(t) of Example 7.12.

It is important to realize that the three different Fourier series representations
(7.25), (7.30) and (7.31) are representative of the function f(t) = t only within the
defined interval 0 < t < 4. Outside this interval the three Fourier series converge
to the three different functions $\phi(t)$, F(t) and G(t), illustrated in Figures 7.17(b),
7.20(b) and 7.20(c) respectively.
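
The distinction between the three representations can be explored numerically. The sketch below evaluates partial sums of (7.25), (7.30) and (7.31) at one point inside the interval and one point outside it; the function names, the number of terms and the sample points are assumptions made for illustration only.

```python
import math

def full_range(t, N):
    """Partial sum of (7.25): 2 - (4/pi) * sum sin(n*pi*t/2) / n."""
    return 2 - 4 / math.pi * sum(
        math.sin(n * math.pi * t / 2) / n for n in range(1, N + 1))

def half_range_cosine(t, N):
    """Partial sum of (7.30): 2 - (16/pi^2) * sum cos((2n-1)*pi*t/4) / (2n-1)^2."""
    return 2 - 16 / math.pi ** 2 * sum(
        math.cos((2 * n - 1) * math.pi * t / 4) / (2 * n - 1) ** 2
        for n in range(1, N + 1))

def half_range_sine(t, N):
    """Partial sum of (7.31): (8/pi) * sum (-1)^(n+1) sin(n*pi*t/4) / n."""
    return 8 / math.pi * sum(
        (-1) ** (n + 1) * math.sin(n * math.pi * t / 4) / n
        for n in range(1, N + 1))

for t in (1.0, -1.0):  # one point inside 0 < t < 4 and one outside it
    print(t, full_range(t, 50), half_range_cosine(t, 50), half_range_sine(t, 50))
```

At t = 1 all three sums lie close to 1, whereas at t = -1 they approach 3, 1 and -1, the values there of the three different periodic extensions. The cosine series, whose coefficients decay like $1/n^2$ because the even extension is continuous, is noticeably the most accurate for a given number of terms.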


7.3.3 Exercises


14 Show that the half-range Fourier sine series expansion of the function f(t) = 1, valid for
$0 < t < \pi$, is

$$f(t) = \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{\sin(2n-1)t}{2n-1} \quad (0 < t < \pi)$$

Sketch the graphs of both f(t) and the periodic function represented by the series expansion
for $-3\pi < t < 3\pi$.

15 Determine the half-range cosine series expansion of the function f(t) = 2t − 1, valid for 0 < t < 1. Sketch
the graphs of both f(t) and the periodic function represented by the series expansion for
$-2 < t < 2$.

16 The function $f(t) = 1 - t^2$ is to be represented by a Fourier series expansion over the finite interval
0 < t < 1. Obtain a suitable

(a) full-range series expansion,
(b) half-range sine series expansion,
(c) half-range cosine series expansion.

Draw graphs of f(t) and of the periodic functions represented by each of the three series for
$-4 < t < 4$.

17 A function f(t) is defined by

$$f(t) = \pi t - t^2 \quad (0 \le t \le \pi)$$

and is to be represented by either a half-range Fourier sine series or a half-range Fourier cosine
series. Find both of these series and sketch the graphs of the functions represented by them for
$-2\pi < t < 2\pi$.

18 A tightly stretched flexible uniform string has its ends fixed at the points x = 0 and x = l. The midpoint
of the string is displaced a distance a, as shown in Figure 7.21. If f(x) denotes the displaced profile of
the string, express f(x) as a Fourier series expansion consisting only of sine terms.

Figure 7.21 Displaced string of Exercise 18.

19 Repeat Exercise 18 for the case where the displaced profile of the string is as shown in Figure 7.22.

Figure 7.22 Displaced string of Exercise 19.

20 A function f(t) is defined on $0 \le t \le \pi$ by

$$f(t) = \begin{cases} \sin t & (0 \le t < \tfrac{1}{2}\pi) \\ 0 & (\tfrac{1}{2}\pi \le t \le \pi) \end{cases}$$

Find a half-range Fourier series expansion of f(t) on this interval. Sketch a graph of
the function represented by the series for $-2\pi \le t \le 2\pi$.

21 A function f(x) is defined on the interval $-l \le x \le l$ by

$$f(x) = \frac{A}{l}\left(|x| - l\right)$$

Obtain a Fourier series expansion of f(x) and sketch a graph of the function represented by the series for
$-3l \le x \le 3l$.

22 The temperature distribution T(x) at a distance x, measured from one end, along a bar of length
L is given by

$$T(x) = Kx(L - x) \quad (0 \le x \le L), \qquad K = \text{constant}$$

Express T(x) as a Fourier series expansion consisting of sine terms only.

23 Find the Fourier series expansion of the function f(t) valid for $-1 < t < 1$, where

$$f(t) = \begin{cases} 1 & (-1 < t < 0) \\ \cos\pi t & (0 < t < 1) \end{cases}$$

To what value does this series converge when t = 1?

7.4 Differentiation and integration of Fourier series


It is inevitable that the desire to obtain the derivative or the integral of a Fourier series
will arise in some applications. Since the smoothing effects of the integration process
tend to eliminate discontinuities, whereas the process of differentiation has the
opposite effect, it is not surprising that the integration of a Fourier series is more likely
to be possible than its differentiation. We shall not pursue the theory in depth here;
rather we shall state, without proof, two theorems concerned with the term-by-term
integration and differentiation of Fourier series, and make some observations on their
use.


7.4.1 Integration of a Fourier series


Theorem 7.3


A Fourier series expansion of a periodic function f(t) that satisfies Dirichlet's
conditions may be integrated term by term, and the integrated series converges to the
integral of the function f(t).


end of theorem 


According to this theorem, if f(t) satisfies Dirichlet's conditions in the interval
$-\pi \le t \le \pi$ and has a Fourier series expansion

$$f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} (a_n \cos nt + b_n \sin nt)$$

then for $-\pi \le t_1 < t \le \pi$

$$\int_{t_1}^{t} f(t)\,dt = \int_{t_1}^{t} \tfrac{1}{2}a_0\,dt + \sum_{n=1}^{\infty} \int_{t_1}^{t} (a_n \cos nt + b_n \sin nt)\,dt$$

$$= \tfrac{1}{2}a_0(t - t_1) + \sum_{n=1}^{\infty} \left[ \frac{b_n}{n}(\cos nt_1 - \cos nt) + \frac{a_n}{n}(\sin nt - \sin nt_1) \right]$$

Because of the presence of the term $\tfrac{1}{2}a_0 t$ on the right-hand side, this is clearly
not a Fourier series expansion of the integral on the left-hand side. However, the result
can be rearranged to be a Fourier series expansion of the function

$$g(t) = \int_{t_1}^{t} f(t)\,dt - \tfrac{1}{2}a_0 t$$

Example 7.13 serves to illustrate this process. Note also that the Fourier coefficients in
the new Fourier series are $-b_n/n$ and $a_n/n$, so, from the observations made in Section
7.2.9, the integrated series converges faster than the original series for f(t). If the
given function f(t) is piecewise-continuous, rather than continuous, over the interval
$-\pi \le t \le \pi$ then care must be taken to ensure that the integration process is
carried out properly over the various subintervals. Again, Example 7.14 serves to
illustrate this point.


Example 7.13 From Example 7.5, the Fourier series expansion of the function

$$f(t) = t^2 \quad (-\pi \le t \le \pi), \qquad f(t + 2\pi) = f(t)$$

is

$$t^2 = \tfrac{1}{3}\pi^2 + 4\sum_{n=1}^{\infty} \frac{(-1)^n \cos nt}{n^2} \quad (-\pi \le t \le \pi)$$

Integrating this result between the limits $-\pi$ and t gives

$$\int_{-\pi}^{t} t^2\,dt = \int_{-\pi}^{t} \tfrac{1}{3}\pi^2\,dt + 4\sum_{n=1}^{\infty} \int_{-\pi}^{t} \frac{(-1)^n \cos nt}{n^2}\,dt$$

that is,

$$\tfrac{1}{3}t^3 = \tfrac{1}{3}\pi^2 t + 4\sum_{n=1}^{\infty} \frac{(-1)^n \sin nt}{n^3} \quad (-\pi \le t \le \pi)$$

Because of the term $\tfrac{1}{3}\pi^2 t$ on the right-hand side, this is clearly not a
Fourier series expansion. However, rearranging, we have

$$t^3 - \pi^2 t = 12\sum_{n=1}^{\infty} \frac{(-1)^n \sin nt}{n^3}$$

and now the right-hand side may be taken to be the Fourier series expansion of the
function

$$g(t) = t^3 - \pi^2 t \quad (-\pi \le t \le \pi), \qquad g(t + 2\pi) = g(t)$$
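The result of Example 7.13 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the number of terms are illustrative choices. It compares a partial sum of the integrated series with the closed form $t^3 - \pi^2 t$ at a few points.

```python
import math

def integrated_series(t, terms=2000):
    """Partial sum of 12 * sum_{n>=1} (-1)^n sin(nt) / n^3."""
    return 12.0 * sum((-1) ** n * math.sin(n * t) / n ** 3
                      for n in range(1, terms + 1))

def g(t):
    """Closed form g(t) = t^3 - pi^2 t, valid on -pi <= t <= pi."""
    return t ** 3 - math.pi ** 2 * t

for t in (-2.0, 0.5, 1.0, 3.0):
    print(f"t = {t:5.2f}  series = {integrated_series(t):.6f}  g(t) = {g(t):.6f}")
```

Because the coefficients decay like $1/n^3$, a modest number of terms already agrees closely with the closed form, which is consistent with the remark that the integrated series converges faster than the original.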


Example 7.14 Integrate term by term the Fourier series expansion obtained in Example 7.4
for the square wave

$$f(t) = \begin{cases} -1 & (-\pi < t < 0) \\ 1 & (0 < t < \pi) \end{cases} \qquad f(t + 2\pi) = f(t)$$

illustrated in Figure 7.7.


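Solution From (7.21), the Fourier series expansion for f(t) is

$$f(t) = \frac{4}{\pi}\sum_{n=1}^{\infty} \frac{\sin(2n-1)t}{2n-1}$$

We now need to integrate between the limits $-\pi$ and t and, owing to the discontinuity
in f(t) at t = 0, we must consider separately values of t in the intervals $-\pi < t < 0$
and $0 < t < \pi$.

Case (i), interval $-\pi < t < 0$. Integrating (7.21) term by term, we have

$$\int_{-\pi}^{t} (-1)\,dt = \frac{4}{\pi}\sum_{n=1}^{\infty} \int_{-\pi}^{t} \frac{\sin(2n-1)t}{2n-1}\,dt$$

that is,

$$-(t + \pi) = -\frac{4}{\pi}\sum_{n=1}^{\infty} \left[ \frac{\cos(2n-1)t}{(2n-1)^2} \right]_{-\pi}^{t} = -\frac{4}{\pi}\left[ \sum_{n=1}^{\infty} \frac{\cos(2n-1)t}{(2n-1)^2} + \sum_{n=1}^{\infty} \frac{1}{(2n-1)^2} \right]$$

It can be shown that

$$\sum_{n=1}^{\infty} \frac{1}{(2n-1)^2} = \tfrac{1}{8}\pi^2$$

(see Exercise 6), so that the above simplifies to

$$-t = \tfrac{1}{2}\pi - \frac{4}{\pi}\sum_{n=1}^{\infty} \frac{\cos(2n-1)t}{(2n-1)^2} \quad (-\pi < t < 0) \tag{7.32}$$

Case (ii), interval $0 < t < \pi$. Integrating (7.21) term by term, we have

$$\int_{-\pi}^{0} (-1)\,dt + \int_{0}^{t} 1\,dt = \frac{4}{\pi}\sum_{n=1}^{\infty} \int_{-\pi}^{t} \frac{\sin(2n-1)t}{2n-1}\,dt$$

giving

$$t = \tfrac{1}{2}\pi - \frac{4}{\pi}\sum_{n=1}^{\infty} \frac{\cos(2n-1)t}{(2n-1)^2} \quad (0 < t < \pi) \tag{7.33}$$

Taking (7.32) and (7.33) together, we find that the function

$$g(t) = |t| = \begin{cases} -t & (-\pi < t < 0) \\ t & (0 < t < \pi) \end{cases} \qquad g(t + 2\pi) = g(t)$$

has a Fourier series expansion

$$g(t) = |t| = \tfrac{1}{2}\pi - \frac{4}{\pi}\sum_{n=1}^{\infty} \frac{\cos(2n-1)t}{(2n-1)^2}$$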
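As a hedged numerical check of Example 7.14 (a sketch only; the helper names and truncation length are assumptions made for illustration), the partial sums of the cosine series can be compared with $|t|$, and the auxiliary sum $\sum 1/(2n-1)^2$ with $\tfrac{1}{8}\pi^2$.

```python
import math

def abs_series(t, terms=2000):
    """Partial sum of pi/2 - (4/pi) * sum cos((2n-1)t) / (2n-1)^2."""
    s = sum(math.cos((2 * n - 1) * t) / (2 * n - 1) ** 2
            for n in range(1, terms + 1))
    return 0.5 * math.pi - (4.0 / math.pi) * s

def odd_reciprocal_squares(terms=2000):
    """Partial sum of sum 1/(2n-1)^2, which tends to pi^2/8."""
    return sum(1.0 / (2 * n - 1) ** 2 for n in range(1, terms + 1))

for t in (-3.0, -1.0, 0.0, 2.5):
    print(f"t = {t:5.2f}  series = {abs_series(t):.5f}  |t| = {abs(t):.5f}")
print(odd_reciprocal_squares(), math.pi ** 2 / 8)
```

The two cases (i) and (ii) of the solution correspond to the negative and positive sample points; the same series reproduces both branches of $|t|$.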


7.4.2 Differentiation of a Fourier series


Theorem 7.4


If f(t) is a periodic function that satisfies Dirichlet's conditions then its derivative
f'(t), wherever it exists, may be found by term-by-term differentiation of the Fourier
series of f(t) if and only if the function f(t) is continuous everywhere and the function
f'(t) has a Fourier series expansion (that is, f'(t) satisfies Dirichlet's conditions).


end of theorem 


It follows from Theorem 7.4 that if the Fourier series expansion of f(t) is differentiable
term by term then f(t) must be periodic at the end points of a period (owing to the
condition that f(t) must be continuous everywhere). Thus, for example, if we are dealing
with a function f(t) of period $2\pi$ and defined in the range $-\pi < t < \pi$ then we must
have $f(-\pi) = f(\pi)$. To illustrate this point, consider the Fourier series expansion of
the function

$$f(t) = t \quad (-\pi < t < \pi), \qquad f(t + 2\pi) = f(t)$$

which, from Example 7.7, is given by

$$f(t) = 2\left(\sin t - \tfrac{1}{2}\sin 2t + \tfrac{1}{3}\sin 3t - \tfrac{1}{4}\sin 4t + \ldots\right)$$

Differentiating term by term, we have

$$f'(t) = 2(\cos t - \cos 2t + \cos 3t - \cos 4t + \ldots)$$

If this differentiation process is valid then f'(t) must be equal to unity for
$-\pi < t < \pi$. Clearly this is not the case, since the series on the right-hand side does
not converge for any value of t. This follows since the nth term of the series is
$2(-1)^{n+1}\cos nt$ and does not tend to zero as $n \to \infty$.
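The failure of term-by-term differentiation for f(t) = t can be seen directly from the partial sums of the derived series. The sketch below is illustrative only (the function name is an assumption): it lists the partial sums of $2\sum (-1)^{n+1}\cos nt$, which keep oscillating instead of settling near 1.

```python
import math

def derived_partial_sums(t, count):
    """Running partial sums of 2 * sum_{n>=1} (-1)^(n+1) cos(nt)."""
    sums, total = [], 0.0
    for n in range(1, count + 1):
        total += 2.0 * (-1) ** (n + 1) * math.cos(n * t)
        sums.append(total)
    return sums

print(derived_partial_sums(0.0, 6))   # alternates between 2 and 0
print([round(s, 3) for s in derived_partial_sums(1.0, 8)])
```

At t = 0 every term has magnitude 2, so the partial sums alternate between 2 and 0 and never approach the value f'(0) = 1.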


If f(t) is continuous everywhere and has a Fourier series expansion

$$f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} (a_n \cos nt + b_n \sin nt)$$

then, from Theorem 7.4, provided that f'(t) satisfies the required conditions, its Fourier
series expansion is

$$f'(t) = \sum_{n=1}^{\infty} (nb_n \cos nt - na_n \sin nt)$$

In this case the Fourier coefficients of the derived expansion are $nb_n$ and $-na_n$, so, in
contrast to the integrated series, the derived series will converge more slowly than the
original series expansion for f(t).


Example 7.15 Consider the process of differentiating term by term the Fourier series
expansion of the function

$$f(t) = t^2 \quad (-\pi \le t \le \pi), \qquad f(t + 2\pi) = f(t)$$

Solution From Example 7.5, the Fourier series expansion of f(t) is

$$t^2 = \tfrac{1}{3}\pi^2 + 4\sum_{n=1}^{\infty} \frac{(-1)^n \cos nt}{n^2} \quad (-\pi \le t \le \pi)$$

Since f(t) is continuous within and at the end points of the interval $-\pi \le t \le \pi$, we
may apply Theorem 7.4 to obtain

$$t = 2\sum_{n=1}^{\infty} \frac{(-1)^{n+1} \sin nt}{n} \quad (-\pi < t < \pi)$$

which conforms with the Fourier series expansion obtained for the function

$$f(t) = t \quad (-\pi < t < \pi), \qquad f(t + 2\pi) = f(t)$$

in Example 7.7.


7.4.3 Coefficients in terms of jumps at discontinuities


For periodic functions that, within a period, are piecewise polynomials and exhibit jump
discontinuities, the Fourier coefficients may be determined in terms of the magnitude of
the jumps and those of derived functions. This method is useful for determining describing
functions (see Section 7.8) for nonlinear characteristics in control engineering, where
only the fundamental component of the Fourier series is important; this applies
particularly to the case of multivalued nonlinearities.

Consider a periodic function f(t), of period T, having within the time interval
$-\tfrac{1}{2}T \le t \le \tfrac{1}{2}T$ a finite number (m + 1) of jump discontinuities
$d_0, d_1, \ldots, d_m$ at times $t_0, t_1, \ldots, t_m$, with $t_0 = -\tfrac{1}{2}T$ and
$t_m = \tfrac{1}{2}T$. Furthermore, within the interval $t_{s-1} < t < t_s$
(s = 1, 2, ..., m) let f(t) be represented by polynomial functions $P_s(t)$
(s = 1, 2, ..., m), as illustrated in Figure 7.23.

Figure 7.23 Piecewise polynomial periodic function exhibiting jump discontinuities.

If f(t) is to be represented in terms of the Fourier series

$$f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos n\omega t + \sum_{n=1}^{\infty} b_n \sin n\omega t$$

then, from (7.4),

$$a_n = \frac{2}{T}\sum_{s=1}^{m} \int_{t_{s-1}}^{t_s} P_s(t) \cos n\omega t\,dt$$

Defining the magnitude of the jump discontinuities as in Section 5.5.11, namely

$$d_i = f(t_i + 0) - f(t_i - 0)$$

and noting that $t_0 = -\tfrac{1}{2}T$ and $t_m = \tfrac{1}{2}T$, integration by parts and
summation gives

$$a_n = -\frac{1}{n\pi}\sum_{s=1}^{m} \left[ d_s \sin n\omega t_s + \int_{t_{s-1}}^{t_s} P_s^{(1)}(t) \sin n\omega t\,dt \right] \tag{7.34}$$

where $P_s^{(1)}(t)$ denotes the piecewise components of the derivative
$f^{(1)}(t) \equiv f'(t)$ in the generalized sense of (5.59).

In a similar manner the integral terms of (7.34) may be expressed as

$$\sum_{s=1}^{m} \int_{t_{s-1}}^{t_s} P_s^{(1)}(t) \sin n\omega t\,dt = \frac{1}{n\omega}\sum_{s=1}^{m} \left[ d_s^{(1)} \cos n\omega t_s + \int_{t_{s-1}}^{t_s} P_s^{(2)}(t) \cos n\omega t\,dt \right]$$

where $d_s^{(1)}$ (s = 1, 2, ..., m) denotes the magnitude of the jump discontinuities in the
derivative $f^{(1)}(t)$.

Continuing in this fashion, integrals involving higher derivatives may be obtained.
However, since all $P_s(t)$ (s = 1, 2, ..., m) are polynomials, a stage is reached when all
the integrals vanish. If the degree of $P_s(t)$ is less than or equal to N for
s = 1, 2, ..., m then

$$a_n = \frac{1}{n\pi}\sum_{s=1}^{m}\sum_{r=0}^{N} (-1)^{r+1}(n\omega)^{-2r}\left[ d_s^{(2r)} \sin n\omega t_s + (n\omega)^{-1} d_s^{(2r+1)} \cos n\omega t_s \right] \quad (n \ne 0) \tag{7.35}$$

where $d_s^{(r)}$ denotes the magnitudes of the jump discontinuities in the rth derivative of
f(t) according to (5.59).

Similarly, it may be shown that

$$b_n = \frac{1}{n\pi}\sum_{s=1}^{m}\sum_{r=0}^{N} (-1)^{r}(n\omega)^{-2r}\left[ d_s^{(2r)} \cos n\omega t_s - (n\omega)^{-1} d_s^{(2r+1)} \sin n\omega t_s \right] \tag{7.36}$$

and the coefficient $a_0$ is found by direct integration of the corresponding Euler formula

$$a_0 = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\,dt \tag{7.37}$$


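Example 7.16 Using (7.35)-(7.37), obtain the Fourier series expansion of the periodic
function f(t) defined by

$$f(t) = \begin{cases} t^2 & (-\pi < t < 0) \\ -2 & (0 < t < \pi) \end{cases} \qquad f(t + 2\pi) = f(t)$$

Solution In this case N = 2, and the graphs of f(t) together with those of its first two
derivatives are shown in Figure 7.24.

Figure 7.24 $f(t)$, $f^{(1)}(t)$, $f^{(2)}(t)$ of Example 7.16.

Jump discontinuities occur at $t = -\pi$, 0 and $\pi$, so that m = 2. The piecewise
polynomials involved and the corresponding jump discontinuities are

(a) $P_1(t) = t^2$, $P_2(t) = -2$; $d_1 = -2$, $d_2 = \pi^2 + 2$

(b) $P_1^{(1)}(t) = 2t$, $P_2^{(1)}(t) = 0$; $d_1^{(1)} = 0$, $d_2^{(1)} = -2\pi$

(c) $P_1^{(2)}(t) = 2$, $P_2^{(2)}(t) = 0$; $d_1^{(2)} = -2$, $d_2^{(2)} = 2$

with $d_1^{(r)} = d_2^{(r)} = 0$ for r > 2.

Taking $\omega = 1$ (since $T = 2\pi$) in (7.35) gives

$$a_n = \frac{1}{n\pi}\left[ -\sum_{s=1}^{2} d_s \sin nt_s - \frac{1}{n}\sum_{s=1}^{2} d_s^{(1)} \cos nt_s + \frac{1}{n^2}\sum_{s=1}^{2} d_s^{(2)} \sin nt_s \right]$$

Since $t_1 = 0$, $t_2 = \pi$, $\sin 0 = \sin n\pi = 0$, $\cos 0 = 1$ and $\cos n\pi = (-1)^n$,
we have

$$a_n = \frac{2}{n^2}(-1)^n \quad (n = 1, 2, 3, \ldots)$$

Likewise, from (7.36),

$$b_n = \frac{1}{n\pi}\left[ \sum_{s=1}^{2} d_s \cos nt_s - \frac{1}{n}\sum_{s=1}^{2} d_s^{(1)} \sin nt_s - \frac{1}{n^2}\sum_{s=1}^{2} d_s^{(2)} \cos nt_s \right]$$

$$= \frac{1}{n\pi}\left[ -2 + (\pi^2 + 2)(-1)^n - \frac{1}{n^2}\left(-2 + 2(-1)^n\right) \right]$$

$$= \frac{1}{n\pi}\left\{ \left(\frac{2}{n^2} - 2\right)\left[1 - (-1)^n\right] + \pi^2(-1)^n \right\} \quad (n = 1, 2, 3, \ldots)$$

and, from (7.37),

$$a_0 = \frac{1}{\pi}\left[ \int_{-\pi}^{0} t^2\,dt + \int_{0}^{\pi} (-2)\,dt \right] = \tfrac{1}{3}\pi^2 - 2$$

Thus the Fourier expansion for f(t) is

$$f(t) = \tfrac{1}{2}\left(\tfrac{1}{3}\pi^2 - 2\right) + \sum_{n=1}^{\infty} \frac{2}{n^2}(-1)^n \cos nt + \sum_{n=1}^{\infty} \frac{1}{n\pi}\left\{ \left(\frac{2}{n^2} - 2\right)\left[1 - (-1)^n\right] + \pi^2(-1)^n \right\} \sin nt$$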
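The jump formulas (7.35) and (7.36) lend themselves to a short routine. The following is a minimal sketch under stated assumptions: the function name `jump_coefficients` and the data layout (a list of jump lists, one per derivative order) are choices made here for illustration and do not come from the text. It is applied to the data of Example 7.16, for which the closed forms give $a_1 = -2$, $b_1 = -\pi$, $a_2 = \tfrac{1}{2}$ and $b_2 = \tfrac{1}{2}\pi$.

```python
import math

def jump_coefficients(n, times, jumps, omega=1.0):
    """Return (a_n, b_n) from the jump formulas (7.35) and (7.36).

    times    : jump instants t_1, ..., t_m
    jumps[k] : jumps d_s^(k) of the k-th derivative at those instants
    """
    nw = n * omega
    a = b = 0.0
    for k, d_k in enumerate(jumps):          # k is the derivative order
        r, odd = divmod(k, 2)                # k = 2r or k = 2r + 1
        scale = nw ** (-k)
        for t_s, d in zip(times, d_k):
            s, c = math.sin(nw * t_s), math.cos(nw * t_s)
            if odd:
                a += (-1) ** (r + 1) * scale * d * c
                b -= (-1) ** r * scale * d * s
            else:
                a += (-1) ** (r + 1) * scale * d * s
                b += (-1) ** r * scale * d * c
    return a / (n * math.pi), b / (n * math.pi)

# Data of Example 7.16: t_1 = 0, t_2 = pi, omega = 1
times = [0.0, math.pi]
jumps = [[-2.0, math.pi ** 2 + 2.0],   # d_s
         [0.0, -2.0 * math.pi],        # d_s^(1)
         [-2.0, 2.0]]                  # d_s^(2)

for n in (1, 2, 3):
    print(n, jump_coefficients(n, times, jumps))
```

Only the jump data enter the computation; no integration is carried out, which is the practical appeal of the method for piecewise-polynomial waveforms.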


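7.4.4 Exercises


24 Show that the periodic function

$$f(t) = t \quad (-T < t < T), \qquad f(t + 2T) = f(t)$$

has a Fourier series expansion

$$f(t) = \frac{2T}{\pi}\left( \sin\frac{\pi t}{T} - \tfrac{1}{2}\sin\frac{2\pi t}{T} + \tfrac{1}{3}\sin\frac{3\pi t}{T} - \tfrac{1}{4}\sin\frac{4\pi t}{T} + \ldots \right)$$

By term-by-term integration of this series, show that the periodic function

$$g(t) = t^2 \quad (-T < t < T), \qquad g(t + 2T) = g(t)$$

has a Fourier series expansion

$$g(t) = \tfrac{1}{3}T^2 - \frac{4T^2}{\pi^2}\left( \cos\frac{\pi t}{T} - \frac{1}{2^2}\cos\frac{2\pi t}{T} + \frac{1}{3^2}\cos\frac{3\pi t}{T} - \frac{1}{4^2}\cos\frac{4\pi t}{T} + \ldots \right)$$

(Hint: A constant of integration must be introduced; it may be evaluated as the mean value
over a period.)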

25 The periodic function

$$h(t) = \pi^2 - t^2 \quad (-\pi < t < \pi), \qquad h(t + 2\pi) = h(t)$$

has a Fourier series expansion

$$h(t) = \tfrac{2}{3}\pi^2 + 4\left( \cos t - \frac{1}{2^2}\cos 2t + \frac{1}{3^2}\cos 3t - \ldots \right)$$

By term-by-term differentiation of this series, confirm the series obtained for f(t) in
Exercise 24 for the case when $T = \pi$.


26 (a) Suppose that the derivative f'(t) of a periodic function f(t) of period $2\pi$ has a
Fourier series expansion

$$f'(t) = \tfrac{1}{2}A_0 + \sum_{n=1}^{\infty} A_n \cos nt + \sum_{n=1}^{\infty} B_n \sin nt$$

Show that

$$A_0 = \frac{1}{\pi}\left[ f(\pi^-) - f(-\pi^+) \right], \qquad A_n = (-1)^n A_0 + nb_n, \qquad B_n = -na_n$$

where $a_0$, $a_n$ and $b_n$ are the Fourier coefficients of the function f(t).

(b) In Example 7.6 we saw that the periodic function

$$f(t) = t^2 + t \quad (-\pi < t < \pi), \qquad f(t + 2\pi) = f(t)$$

has a Fourier series expansion

$$f(t) = \tfrac{1}{3}\pi^2 + \sum_{n=1}^{\infty} \frac{4}{n^2}(-1)^n \cos nt - \sum_{n=1}^{\infty} \frac{2}{n}(-1)^n \sin nt$$

Differentiate this series term by term, and explain why it is not a Fourier expansion of
the periodic function

$$g(t) = 2t + 1 \quad (-\pi < t < \pi), \qquad g(t + 2\pi) = g(t)$$

(c) Use the results of (a) to obtain the Fourier series expansion of g(t) and confirm your
solution by direct evaluation of the coefficients using Euler's formulae.


27 Using (7.35)-(7.37), confirm the following Fourier series expansions:

(a) (7.21) for the square wave of Example 7.4;

(b) the expansion obtained in Example 7.1 for the sawtooth wave;

(c) the expansion obtained for the piecewise-continuous function f(t) of Example 7.3.


28 Consider the periodic function

$$f(t) = \begin{cases} 0 & (-\pi < t < -\tfrac{1}{2}\pi) \\ \pi + 2t & (-\tfrac{1}{2}\pi < t < 0) \\ \pi - 2t & (0 < t < \tfrac{1}{2}\pi) \\ 0 & (\tfrac{1}{2}\pi < t < \pi) \end{cases} \qquad f(t + 2\pi) = f(t)$$

(a) Sketch a graph of the function for $-4\pi < t < 4\pi$.

(b) Use (7.35)-(7.37) to obtain the Fourier series expansion

$$f(t) = \tfrac{1}{4}\pi - \frac{4}{\pi}\sum_{n=1}^{\infty} \frac{1}{n^2}\left(\cos\tfrac{1}{2}n\pi - 1\right)\cos nt$$

and write out the first 10 terms of this series. (Note: Although the function f(t) itself
has no jump discontinuities, the method may be used since the derivative does have jump
discontinuities.)


29 Use the method of Section 7.4.3 to obtain the 
Fourier series expansions for the following periodic 
functions: 

0 (-<t<0) 
(a) f(t) -) 5 


t (0<t<mT) 
At + 2n) 2 f(0) 


2 (-т<г<-ш) 
(6) A=; f 
-2 (n«t«m) 
f(t * 2n) = 50) 


t (0<1< 1) 
1-1 (l<t<2) 


f(t + 2)=fO 


+t (d«t«0) 


1 1 
(7-57 € t € 5n) 


(c) Л) = 


(d) fit) = 





Nie Nie 


-t (0<t#<}) 


f * 1) fe) 


7.5 Engineering application: frequency response and oscillating systems


7.5.1 Response to periodic input


In Section 5.7 we showed that the frequency response, defined as the steady-state response
to a sinusoidal input $A \sin \omega t$, of a stable linear system having a transfer
function G(s) is given by (5.101) as

$$x_{ss}(t) = A|G(j\omega)| \sin[\omega t + \arg G(j\omega)] \tag{7.38}$$

By employing a Fourier series expansion, we can use this result to determine the
steady-state response of a stable linear system to a non-sinusoidal periodic input. For a
stable linear system having a transfer function G(s), let the input be a periodic function
P(t) of period 2T (that is, one having frequency $\omega = \pi/T$ in rad s$^{-1}$). P(t) may
be expressed in the form of the Fourier series expansion

$$P(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} A_n \sin(n\omega t + \phi_n) \tag{7.39}$$

where $A_n$ and $\phi_n$ are defined as in Section 7.2.1. The steady-state response to each
term in the series expansion (7.39) may be obtained using (7.38). Since the system is
linear, the principle of superposition holds, so that the steady-state response to the
periodic input P(t) may be obtained as the sum of the steady-state responses to the
individual sinusoids comprising the sum in (7.39). Thus the steady-state response to the
input P(t) is

$$x_{ss}(t) = \tfrac{1}{2}a_0 G(0) + \sum_{n=1}^{\infty} A_n |G(jn\omega)| \sin[n\omega t + \phi_n + \arg G(jn\omega)] \tag{7.40}$$

There are two issues related to this steady-state response that are worthy of note.

(a) For practical systems $|G(j\omega)| \to 0$ as $\omega \to \infty$, so that
$|G(jn\omega)| \to 0$ as $n \to \infty$ in (7.40). As a consequence, the Fourier series
representation of the steady-state response $x_{ss}(t)$ converges more rapidly than the
Fourier series representation of the periodic input P(t). From a practical point of view,
this is not surprising, since it is a consequence of the smoothing action of the system
(that is, as indicated in Section 7.4, integration is a 'smoothing' operation).

(b) There is a significant difference between the steady-state response (7.40) to a
non-sinusoidal periodic input of frequency $\omega$ and the steady-state response (7.38)
to a pure sinusoid at the same frequency. As indicated in (7.38), in the case of a
sinusoidal input at frequency $\omega$ the steady-state response is also a sinusoid at the
same frequency $\omega$. However, for a non-sinusoidal periodic input P(t) at frequency
$\omega$ the steady-state response (7.40) is no longer at the same frequency; rather it
comprises an infinite sum of sinusoids having frequencies $n\omega$ that are integer
multiples of the input frequency $\omega$. This clearly has important practical
implications, particularly when considering the responses of oscillating or vibrating
systems. If the frequency $n\omega$ of one of the harmonics in (7.40) is close to the
natural oscillating frequency of an underdamped system then the phenomenon of resonance
will arise.

To someone unfamiliar with the theory, it may seem surprising that a practical system may
resonate at a frequency much higher than that of the input. As indicated in Example 5.30,
the phenomenon of resonance is important in practice, and it is therefore important that
engineers have some knowledge of the theory associated with Fourier series, so that the
possible dominance of a system response by one of the higher harmonics, rather than the
fundamental, may be properly interpreted.

Example 7.17 The mass-spring-damper system of Figure 7.25(a) is initially at rest in a
position of equilibrium. Determine the steady-state response of the system when the mass
is subjected to an externally applied periodic force P(t) having the form of the square
wave shown in Figure 7.25(b).

Figure 7.25 (a) System and (b) input for Example 7.17.


Solution From Newton's law, the displacement x(t) of the mass at time t is given by

$$M\frac{d^2x}{dt^2} + B\frac{dx}{dt} + Kx = P(t) \tag{7.41}$$

so that the system may be represented by the block diagram of Figure 7.26. Thus the
system transfer function is

$$G(s) = \frac{1}{Ms^2 + Bs + K} \tag{7.42}$$

Figure 7.26 Block diagram for the system of Figure 7.25.

From Example 7.4, the Fourier series expansion for the square wave P(t) is

$$P(t) = \frac{40}{\pi}\left[ \sin t + \frac{\sin 3t}{3} + \frac{\sin 5t}{5} + \ldots + \frac{\sin(2n-1)t}{2n-1} + \ldots \right]$$

that is,

$$P(t) = u_1(t) + u_2(t) + u_3(t) + \ldots + u_n(t) + \ldots \tag{7.43}$$

where

$$u_n(t) = \frac{40 \sin(2n-1)t}{\pi(2n-1)} \tag{7.44}$$

Substituting the given values for M, B and K, the transfer function (7.42) becomes

$$G(s) = \frac{1}{10s^2 + 0.5s + 250}$$

Thus

$$G(j\omega) = \frac{1}{-10\omega^2 + 0.5j\omega + 250} = \frac{250 - 10\omega^2}{D} - j\frac{0.5\omega}{D}$$

where $D = (250 - 10\omega^2)^2 + 0.25\omega^2$, so that

$$|G(j\omega)| = \sqrt{\frac{(250 - 10\omega^2)^2}{D^2} + \frac{0.25\omega^2}{D^2}} = \frac{1}{\sqrt{D}} = \frac{1}{\sqrt{(250 - 10\omega^2)^2 + 0.25\omega^2}} \tag{7.45}$$

$$\arg G(j\omega) = -\tan^{-1}\left( \frac{0.5\omega}{250 - 10\omega^2} \right) \tag{7.46}$$

Using (7.38), the steady-state response of the system to the nth harmonic $u_n(t)$ given by
(7.44) is

$$x_{ss_n}(t) = \frac{40}{\pi(2n-1)} |G(j(2n-1))| \sin[(2n-1)t + \arg G(j(2n-1))] \tag{7.47}$$

where $|G(j\omega)|$ and $\arg G(j\omega)$ are given by (7.45) and (7.46) respectively. The
steady-state response $x_{ss}(t)$ of the system to the square-wave input P(t) is then
determined as the sum of the steady-state responses due to the individual harmonics in
(7.43); that is,

$$x_{ss}(t) = \sum_{n=1}^{\infty} x_{ss_n}(t) \tag{7.48}$$

where $x_{ss_n}(t)$ is given by (7.47).

Evaluating the first few terms of the response (7.48), we have

$$x_{ss_1}(t) = \frac{40}{\pi}\frac{1}{\sqrt{(250 - 10)^2 + 0.25}} \sin\left[ t - \tan^{-1}\left(\frac{0.5}{240}\right) \right] \approx 0.053 \sin(t - 0.003)$$

$$x_{ss_2}(t) = \frac{40}{3\pi}\frac{1}{\sqrt{(250 - 90)^2 + 2.25}} \sin\left[ 3t - \tan^{-1}\left(\frac{1.5}{160}\right) \right] \approx 0.027 \sin(3t - 0.009)$$

$$x_{ss_3}(t) = \frac{40}{5\pi}\frac{1}{\sqrt{6.25}} \sin\left[ 5t - \tan^{-1}\left(\frac{2.5}{0}\right) \right] = 1.02 \sin(5t - \tfrac{1}{2}\pi)$$

$$x_{ss_4}(t) = \frac{40}{7\pi}\frac{1}{\sqrt{(250 - 490)^2 + 12.25}} \sin\left[ 7t - \tan^{-1}\left(\frac{3.5}{-240}\right) \right] \approx 0.0076 \sin(7t - 3.127)$$

Thus a good approximation to the steady-state response (7.48) is

$$x_{ss}(t) \approx 0.053 \sin(t - 0.003) + 0.027 \sin(3t - 0.009) + 1.02 \sin(5t - \tfrac{1}{2}\pi) + 0.0076 \sin(7t - 3.127) \tag{7.49}$$

Figure 7.27 Steady-state response of system of Figure 7.25.

The graph of this displacement is shown in Figure 7.27, and it appears from this that the
response has a frequency about five times that of the input. This is because the term
$1.02 \sin(5t - \tfrac{1}{2}\pi)$ dominates in the response (7.49); this is a consequence of
the fact that the natural frequency of oscillation of the system is
$\sqrt{K/M} = 5$ rad s$^{-1}$, so that it is in resonance with this particular harmonic.

In conclusion, it should be noted that it was not essential to introduce transfer
functions to solve this problem. Alternatively, by determining the particular integral of
the differential equation (7.41), the steady-state response to an input $A \sin \omega t$
is determined as

$$x_{ss}(t) = \frac{A \sin(\omega t - \alpha)}{\sqrt{(K - M\omega^2)^2 + B^2\omega^2}}, \qquad \tan \alpha = \frac{B\omega}{K - M\omega^2}$$

giving $x_{ss}(t)$ as in (7.48). The solution then proceeds as before.
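The harmonic amplitudes and phases of Example 7.17 can be reproduced with a few lines of complex arithmetic. This is a sketch only, assuming the values M = 10, B = 0.5 and K = 250 that appear in the transfer function above; the function names are illustrative.

```python
import cmath
import math

M, B, K = 10.0, 0.5, 250.0   # values substituted into (7.42) in Example 7.17

def G(s):
    """Transfer function G(s) = 1 / (M s^2 + B s + K)."""
    return 1.0 / (M * s * s + B * s + K)

def harmonic_response(n):
    """Amplitude and phase of the response (7.47) to the harmonic u_n(t)."""
    w = 2 * n - 1                       # frequency of the nth term of (7.43)
    g = G(1j * w)
    return 40.0 / (math.pi * w) * abs(g), cmath.phase(g)

for n in range(1, 5):
    amp, phase = harmonic_response(n)
    print(f"harmonic at {2 * n - 1} rad/s: amplitude {amp:.4f}, phase {phase:.4f} rad")
```

The third term (5 rad/s) has by far the largest amplitude, which is the resonance effect discussed in the solution. Using `cmath.phase` places the phase of the fourth term in the correct quadrant automatically, whereas a bare arctangent of the ratio would not.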


7.5.2 Exercises


30 Determine the steady-state current in the circuit of Figure 7.28(a) as a result of the
applied periodic voltage shown in Figure 7.28(b).


Figure 7.28 (a) Circuit of Exercise 30 (component labels: R = 300 Ω, C = 4 × 10⁻⁶ F,
L = 0.02 H); (b) applied voltage.

31 Determine the steady-state response of the mass-spring-damper system of Figure 7.29(a)
when the mass is subjected to the externally applied periodic force f(t) shown in
Figure 7.29(b). What frequency dominates the response, and why?






Figure 7.29 (a) Mass-spring-damper system of Exercise 31 (legible label: B = 0.5 kg s⁻¹);
(b) applied force.

32 Determine the steady-state motion of the mass of Figure 7.30(a) when it is subjected to
the externally applied force of Figure 7.30(b).


Figure 7.30 (a) Mass-spring-damper system of Exercise 32 (legible labels: K = 80 N m⁻¹,
M = 20 kg, damper 0.02 kg s⁻¹); (b) applied force (vertical axis marked at 50).


33 Determine the steady-state current in the circuit shown in Figure 7.31(a) when the
applied voltage is of the form shown in Figure 7.31(b).

Figure 7.31 (a) Circuit of Exercise 33 (legible labels: 10⁻⁵ F, 0.4 H); (b) applied
voltage e(t) (vertical axis marked at 100; time axis marked at 0.02, 0.04, 0.06).


7.6 Complex form of Fourier series


An alternative to the trigonometric form of the Fourier series considered so far is the 
complex or exponential form. As a result of the properties of the exponential function, 
this form is easily manipulated mathematically. It is widely used by engineers in prac- 
tice, particularly in work involving signal analysis, and provides a smoother transition 
from the consideration of Fourier series for dealing with periodic signals to the con- 
sideration of Fourier transforms for dealing with aperiodic signals, which will be dealt 
with in Chapter 8. 


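7.6.1 Complex representation

To develop the complex form of the Fourier series

$$f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos n\omega t + \sum_{n=1}^{\infty} b_n \sin n\omega t \tag{7.50}$$

representing a periodic function f(t) of period T, we proceed as follows. Substituting
the results

$$\sin n\omega t = \frac{1}{2j}\left( e^{jn\omega t} - e^{-jn\omega t} \right), \qquad \cos n\omega t = \tfrac{1}{2}\left( e^{jn\omega t} + e^{-jn\omega t} \right)$$

into (7.50) gives

$$f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \frac{e^{jn\omega t} + e^{-jn\omega t}}{2} + \sum_{n=1}^{\infty} b_n \frac{e^{jn\omega t} - e^{-jn\omega t}}{2j}$$

$$= \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} \tfrac{1}{2}a_n\left( e^{jn\omega t} + e^{-jn\omega t} \right) + \sum_{n=1}^{\infty} \left(-\tfrac{1}{2}j\right) b_n\left( e^{jn\omega t} - e^{-jn\omega t} \right)$$

$$= \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} \left[ \tfrac{1}{2}(a_n - jb_n)e^{jn\omega t} + \tfrac{1}{2}(a_n + jb_n)e^{-jn\omega t} \right] \tag{7.51}$$

Writing

$$c_0 = \tfrac{1}{2}a_0, \qquad c_n = \tfrac{1}{2}(a_n - jb_n), \qquad c_{-n} = c_n^* = \tfrac{1}{2}(a_n + jb_n) \tag{7.52}$$

(7.51) becomes

$$f(t) = c_0 + \sum_{n=1}^{\infty} c_n e^{jn\omega t} + \sum_{n=1}^{\infty} c_{-n} e^{-jn\omega t}$$

$$= c_0 + \sum_{n=1}^{\infty} c_n e^{jn\omega t} + \sum_{n=-1}^{-\infty} c_n e^{jn\omega t}$$

$$= \sum_{n=-\infty}^{\infty} c_n e^{jn\omega t}, \quad \text{since } c_0 e^{0} = c_0$$

Thus the Fourier series (7.50) becomes simply

$$f(t) = \sum_{n=-\infty}^{\infty} c_n e^{jn\omega t} \tag{7.53}$$

which is referred to as the complex or exponential form of the Fourier series expansion
of the function f(t).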

In order that we can apply this result directly, it is necessary to obtain a formula for
calculating the complex coefficients $c_n$. To do this, we incorporate the Euler formulae
(7.4) and (7.5) into the definitions given in (7.52), leading to

$$c_0 = \tfrac{1}{2}a_0 = \frac{1}{T}\int_{d}^{d+T} f(t)\,dt \tag{7.54}$$

$$c_n = \tfrac{1}{2}(a_n - jb_n) = \frac{1}{T}\left[ \int_{d}^{d+T} f(t) \cos n\omega t\,dt - j\int_{d}^{d+T} f(t) \sin n\omega t\,dt \right]$$

$$= \frac{1}{T}\int_{d}^{d+T} f(t)(\cos n\omega t - j \sin n\omega t)\,dt = \frac{1}{T}\int_{d}^{d+T} f(t) e^{-jn\omega t}\,dt \tag{7.55}$$

$$c_{-n} = \tfrac{1}{2}(a_n + jb_n) = \frac{1}{T}\int_{d}^{d+T} f(t)(\cos n\omega t + j \sin n\omega t)\,dt = \frac{1}{T}\int_{d}^{d+T} f(t) e^{jn\omega t}\,dt \tag{7.56}$$

From (7.54)-(7.56), it is readily seen that for all values of n

$$c_n = \frac{1}{T}\int_{d}^{d+T} f(t) e^{-jn\omega t}\,dt \tag{7.57}$$

Summary

In summary, the complex form of the Fourier series expansion of a periodic function
f(t), of period T, is

$$f(t) = \sum_{n=-\infty}^{\infty} c_n e^{jn\omega t} \tag{7.53}$$

where

$$c_n = \frac{1}{T}\int_{d}^{d+T} f(t) e^{-jn\omega t}\,dt \quad (n = 0, \pm 1, \pm 2, \ldots) \tag{7.57}$$

In general the coefficients $c_n$ ($n = 0, \pm 1, \pm 2, \ldots$) are complex, and may be
expressed in the form

$$c_n = |c_n| e^{j\phi_n}$$

where $|c_n|$, the magnitude of $c_n$, is given from the definitions (7.52) by

$$|c_n| = \sqrt{\left(\tfrac{1}{2}a_n\right)^2 + \left(\tfrac{1}{2}b_n\right)^2} = \tfrac{1}{2}\sqrt{a_n^2 + b_n^2}$$

so that $2|c_n|$ is the amplitude of the nth harmonic. The argument $\phi_n$ of $c_n$ is
related to the phase of the nth harmonic.

Figure 7.32 Function f(t) of Example 7.18.


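Example 7.18 Find the complex form of the Fourier series expansion of the periodic
function f(t) defined by

$$f(t) = \cos\tfrac{1}{2}t \quad (-\pi < t < \pi), \qquad f(t + 2\pi) = f(t)$$

Solution A graph of the function f(t) over the interval $-3\pi \le t \le 3\pi$ is shown in
Figure 7.32. Here the period T is $2\pi$, so from (7.57) the complex coefficients $c_n$ are
given by

$$c_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} \cos\tfrac{1}{2}t\; e^{-jnt}\,dt = \frac{1}{4\pi}\int_{-\pi}^{\pi} \left( e^{jt/2} + e^{-jt/2} \right) e^{-jnt}\,dt$$

$$= \frac{1}{4\pi}\int_{-\pi}^{\pi} \left( e^{-j(n-1/2)t} + e^{-j(n+1/2)t} \right) dt$$

$$= \frac{1}{4\pi}\left[ \frac{-2e^{-j(2n-1)t/2}}{j(2n-1)} - \frac{2e^{-j(2n+1)t/2}}{j(2n+1)} \right]_{-\pi}^{\pi}$$

$$= \frac{j}{2\pi}\left[ \left( \frac{e^{-jn\pi}e^{j\pi/2}}{2n-1} + \frac{e^{-jn\pi}e^{-j\pi/2}}{2n+1} \right) - \left( \frac{e^{jn\pi}e^{-j\pi/2}}{2n-1} + \frac{e^{jn\pi}e^{j\pi/2}}{2n+1} \right) \right]$$

Now $e^{j\pi/2} = \cos\tfrac{1}{2}\pi + j\sin\tfrac{1}{2}\pi = j$, $e^{-j\pi/2} = -j$ and
$e^{jn\pi} = e^{-jn\pi} = \cos n\pi = (-1)^n$, so that

$$c_n = \frac{j}{2\pi}(-1)^n\left( \frac{j}{2n-1} - \frac{j}{2n+1} + \frac{j}{2n-1} - \frac{j}{2n+1} \right)$$

$$= \frac{(-1)^n}{\pi}\left( \frac{1}{2n+1} - \frac{1}{2n-1} \right) = \frac{-2(-1)^n}{(4n^2 - 1)\pi}$$

Note that in this case $c_n$ is real, which is as expected, since the function f(t) is an
even function of t.

From (7.53), the complex Fourier series expansion for f(t) is

$$f(t) = \sum_{n=-\infty}^{\infty} \frac{2(-1)^{n+1}}{(4n^2 - 1)\pi} e^{jnt}$$

This may readily be converted back to the trigonometric form, since, from the definitions
(7.52),

$$a_0 = 2c_0, \qquad a_n = c_n + c_n^*, \qquad b_n = j(c_n - c_n^*)$$

so that in this particular case

$$a_0 = \frac{4}{\pi}, \qquad a_n = 2\left[ \frac{2}{\pi}\frac{(-1)^{n+1}}{4n^2 - 1} \right] = \frac{4}{\pi}\frac{(-1)^{n+1}}{4n^2 - 1}, \qquad b_n = 0$$

Thus the trigonometric form of the Fourier series is

$$f(t) = \frac{2}{\pi} + \frac{4}{\pi}\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{4n^2 - 1}\cos nt$$

which corresponds to the solution to Exercise 1(e).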
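Formula (7.57) can also be evaluated numerically, which gives an independent check on the closed form found in Example 7.18. The sketch below is illustrative: the midpoint rule, the function name `complex_coefficient` and the step count are assumptions made here, not a method prescribed by the text.

```python
import cmath
import math

def complex_coefficient(f, n, period, d=0.0, steps=20000):
    """Midpoint-rule estimate of c_n = (1/T) * integral over [d, d+T] of f(t) e^{-j n w t} dt."""
    omega = 2.0 * math.pi / period
    h = period / steps
    total = 0j
    for k in range(steps):
        t = d + (k + 0.5) * h
        total += f(t) * cmath.exp(-1j * n * omega * t)
    return total * h / period

def exact_c(n):
    """Closed form from Example 7.18: c_n = -2(-1)^n / ((4n^2 - 1) pi)."""
    return -2.0 * (-1) ** n / ((4 * n * n - 1) * math.pi)

f = lambda t: math.cos(0.5 * t)       # defined on -pi < t < pi, period 2 pi
for n in (-2, -1, 0, 1, 2):
    c = complex_coefficient(f, n, 2.0 * math.pi, d=-math.pi)
    print(n, round(c.real, 6), round(c.imag, 6), round(exact_c(n), 6))
```

The imaginary parts come out at rounding level, in line with the observation that $c_n$ is real for an even function.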


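Example 7.19 Obtain the complex form of the Fourier series of the sawtooth function f(t)
defined by

$$f(t) = \frac{2t}{T} \quad (0 < t < 2T), \qquad f(t + 2T) = f(t)$$

Figure 7.33 Function f(t) of Example 7.19.

Solution A graph of the function f(t) over the interval $-6T < t < 6T$ is shown in
Figure 7.33. Here the period is 2T, that is $\omega = \pi/T$, so from (7.57) the complex
coefficients $c_n$ are given by

$$c_n = \frac{1}{2T}\int_{0}^{2T} f(t) e^{-jn\pi t/T}\,dt = \frac{1}{2T}\int_{0}^{2T} \frac{2}{T}\,t\,e^{-jn\pi t/T}\,dt$$

$$= \frac{1}{T^2}\left[ \frac{Tt}{-jn\pi}e^{-jn\pi t/T} - \frac{T^2}{(jn\pi)^2}e^{-jn\pi t/T} \right]_{0}^{2T} \quad (n \ne 0)$$

Now $e^{-j2n\pi} = e^{0} = 1$, so

$$c_n = \frac{1}{T^2}\left[ \frac{2T^2}{-jn\pi} + \frac{T^2}{(n\pi)^2} - \frac{T^2}{(n\pi)^2} \right] = \frac{j2}{n\pi} \quad (n \ne 0)$$

In the particular case n = 0

$$c_0 = \frac{1}{2T}\int_{0}^{2T} f(t)\,dt = \frac{1}{2T}\int_{0}^{2T} \frac{2t}{T}\,dt = \frac{1}{T^2}\left[ \tfrac{1}{2}t^2 \right]_{0}^{2T} = 2$$

Thus from (7.53) the complex form of the Fourier series expansion of f(t) is

$$f(t) = 2 + \sum_{n=-\infty}^{-1} \frac{j2}{n\pi}e^{jn\pi t/T} + \sum_{n=1}^{\infty} \frac{j2}{n\pi}e^{jn\pi t/T} = 2 + \sum_{\substack{n=-\infty \\ n \ne 0}}^{\infty} \frac{j2}{n\pi}e^{jn\pi t/T}$$

Noting that $j = e^{j\pi/2}$, this result may also be written in the form

$$f(t) = 2 + \frac{2}{\pi}\sum_{\substack{n=-\infty \\ n \ne 0}}^{\infty} \frac{1}{n}e^{j(n\pi t/T + \pi/2)}$$

As in Example 7.18, the Euler coefficients in the corresponding trigonometric series are

$$a_0 = 2c_0 = 4, \qquad a_n = c_n + c_n^* = 0, \qquad b_n = j(c_n - c_n^*) = j\left( \frac{2j}{n\pi} + \frac{2j}{n\pi} \right) = -\frac{4}{n\pi}$$

so that the corresponding trigonometric Fourier series expansion of f(t) is

$$f(t) = 2 - \frac{4}{\pi}\sum_{n=1}^{\infty} \frac{1}{n}\sin\frac{n\pi t}{T}$$

which corresponds to the solution of Example 7.11 when T = 2.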
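The conversion from complex to trigonometric coefficients used in Examples 7.18 and 7.19 is mechanical, and can be sketched as follows. The helper names are hypothetical and the truncation length is an arbitrary choice; the code applies $a_n = c_n + c_n^*$ and $b_n = j(c_n - c_n^*)$ to the sawtooth coefficients and then sums the resulting sine series at one point.

```python
import math

def trig_from_complex(c_n):
    """Return (a_n, b_n) from a_n = c_n + c_n*, b_n = j(c_n - c_n*)."""
    c_conj = c_n.conjugate()
    a_n = (c_n + c_conj).real
    b_n = (1j * (c_n - c_conj)).real
    return a_n, b_n

def sawtooth_c(n):
    """Coefficients of Example 7.19: c_0 = 2, c_n = 2j/(n pi) for n != 0."""
    return 2.0 if n == 0 else 2j / (n * math.pi)

def sawtooth_partial_sum(t, T, terms=5000):
    """Partial sum of the trigonometric series rebuilt from the c_n."""
    total = sawtooth_c(0)                 # constant term c_0 = a_0 / 2
    for n in range(1, terms + 1):
        a_n, b_n = trig_from_complex(sawtooth_c(n))
        x = n * math.pi * t / T
        total += a_n * math.cos(x) + b_n * math.sin(x)
    return total

print(trig_from_complex(sawtooth_c(1)))          # a_1 = 0, b_1 = -4/pi
print(sawtooth_partial_sum(0.5, 1.0), 2 * 0.5 / 1.0)
```

The series converges slowly (coefficients decay like $1/n$), so several thousand terms are used to get three-figure agreement with $f(t) = 2t/T$.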


7.6.2 The multiplication theorem and Parseval's theorem 


Two useful results, particularly in the application of Fourier series to signal analysis, 
are the multiplication theorem and Parseval's theorem. The multiplication theorem 
enables us to write down the mean value of the product of two periodic functions over 
a period in terms of the coefficients of their Fourier series expansions, while Parseval's 
theorem enables us to write down the mean square value of a periodic function, which, 
as we will see in Section 7.6.4, determines the power spectrum of the function. 


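Theorem 7.5 The multiplication theorem


If f(t) and g(t) are two periodic functions having the same period T then

$$\frac{1}{T}\int_{c}^{c+T} f(t)g(t)\,dt = \sum_{n=-\infty}^{\infty} c_n d_n^* \tag{7.58}$$

where the $c_n$ and $d_n$ are the coefficients in the complex Fourier series expansions of
f(t) and g(t) respectively.


Proof Let f(t) and g(t) have complex Fourier series given by

$$f(t) = \sum_{n=-\infty}^{\infty} c_n e^{jn2\pi t/T} \tag{7.59a}$$

with

$$c_n = \frac{1}{T}\int_{c}^{c+T} f(t) e^{-jn2\pi t/T}\,dt \tag{7.59b}$$

and

$$g(t) = \sum_{n=-\infty}^{\infty} d_n e^{jn2\pi t/T} \tag{7.60a}$$

with

$$d_n = \frac{1}{T}\int_{c}^{c+T} g(t) e^{-jn2\pi t/T}\,dt \tag{7.60b}$$

Then

$$\frac{1}{T}\int_{c}^{c+T} f(t)g(t)\,dt = \frac{1}{T}\int_{c}^{c+T} \left( \sum_{n=-\infty}^{\infty} c_n e^{jn2\pi t/T} \right) g(t)\,dt \quad \text{using (7.59a)}$$

$$= \sum_{n=-\infty}^{\infty} c_n \left[ \frac{1}{T}\int_{c}^{c+T} g(t) e^{jn2\pi t/T}\,dt \right] \quad \text{assuming term-by-term integration is possible}$$

$$= \sum_{n=-\infty}^{\infty} c_n d_{-n} \quad \text{using (7.60b)}$$

Since $d_{-n} = d_n^*$, the complex conjugate of $d_n$, this reduces to the required result:

$$\frac{1}{T}\int_{c}^{c+T} f(t)g(t)\,dt = \sum_{n=-\infty}^{\infty} c_n d_n^*$$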


end of theorem 


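In terms of the real coefficients $a_n$, $b_n$ and $\alpha_n$, $\beta_n$ of the corresponding
trigonometric Fourier series expansions of f(t) and g(t),

$$f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos\left( \frac{n2\pi t}{T} \right) + \sum_{n=1}^{\infty} b_n \sin\left( \frac{n2\pi t}{T} \right)$$

$$g(t) = \tfrac{1}{2}\alpha_0 + \sum_{n=1}^{\infty} \alpha_n \cos\left( \frac{n2\pi t}{T} \right) + \sum_{n=1}^{\infty} \beta_n \sin\left( \frac{n2\pi t}{T} \right)$$

and using the definitions (7.52), the multiplication theorem result (7.58) reduces to

$$\frac{1}{T}\int_{c}^{c+T} f(t)g(t)\,dt = \sum_{n=1}^{\infty} c_{-n}d_n + c_0 d_0 + \sum_{n=1}^{\infty} c_n d_{-n}$$

$$= \tfrac{1}{4}a_0\alpha_0 + \tfrac{1}{4}\sum_{n=1}^{\infty} \left[ (a_n - jb_n)(\alpha_n + j\beta_n) + (a_n + jb_n)(\alpha_n - j\beta_n) \right]$$

giving

$$\frac{1}{T}\int_{c}^{c+T} f(t)g(t)\,dt = \tfrac{1}{4}a_0\alpha_0 + \tfrac{1}{2}\sum_{n=1}^{\infty} (a_n\alpha_n + b_n\beta_n)$$


Theorem 7.6 Parseval's theorem


If f(t) is a periodic function with period T then

$$\frac{1}{T}\int_{c}^{c+T} [f(t)]^2\,dt = \sum_{n=-\infty}^{\infty} c_n c_n^* = \sum_{n=-\infty}^{\infty} |c_n|^2 \tag{7.61}$$

where the $c_n$ are the coefficients in the complex Fourier series expansion of f(t).


Proof This result follows from the multiplication theorem, since, taking g(t) = f(t) in
(7.58), we obtain

$$\frac{1}{T}\int_{c}^{c+T} [f(t)]^2\,dt = \sum_{n=-\infty}^{\infty} c_n c_n^* = \sum_{n=-\infty}^{\infty} |c_n|^2$$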


end of theorem 


Using (7.60), Parseval's theorem may be written in terms of the real coefficients $a_n$
and $b_n$ of the trigonometric Fourier series expansion of the function f(t) as

$$\frac{1}{T}\int_{c}^{c+T} [f(t)]^2\,dt = \tfrac{1}{4}a_0^2 + \tfrac{1}{2}\sum_{n=1}^{\infty} (a_n^2 + b_n^2) \tag{7.62}$$

The root mean square (RMS) value $f_{\mathrm{RMS}}$ of a periodic function f(t) of period T,
defined by

$$f_{\mathrm{RMS}}^2 = \frac{1}{T}\int_{c}^{c+T} [f(t)]^2\,dt$$

may therefore be expressed in terms of the Fourier coefficients using (7.61) or (7.62).
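Both results are easy to test on trigonometric polynomials, whose Fourier coefficients can be read off by inspection. The sketch below is illustrative: the two test signals and the helper `mean_over_period` are assumptions chosen here. With $f(t) = 1 + 2\cos t + 3\sin 2t$ (so $a_0 = 2$, $a_1 = 2$, $b_2 = 3$) and $g(t) = 4 + \cos t - \sin 2t$ (so $\alpha_0 = 8$, $\alpha_1 = 1$, $\beta_2 = -1$), the real form of the multiplication theorem predicts a mean product of $\tfrac{1}{4}(2)(8) + \tfrac{1}{2}(2 \cdot 1 + 3 \cdot (-1)) = 3.5$, and (7.62) predicts a mean square of $\tfrac{1}{4}(2)^2 + \tfrac{1}{2}(2^2 + 3^2) = 7.5$.

```python
import math

def mean_over_period(func, period, steps=4096):
    """Midpoint-rule estimate of (1/T) * integral of func over one period."""
    h = period / steps
    return sum(func((k + 0.5) * h) for k in range(steps)) * h / period

def f(t):
    return 1.0 + 2.0 * math.cos(t) + 3.0 * math.sin(2.0 * t)

def g(t):
    return 4.0 + math.cos(t) - math.sin(2.0 * t)

T = 2.0 * math.pi
mean_product = mean_over_period(lambda t: f(t) * g(t), T)
mean_square = mean_over_period(lambda t: f(t) ** 2, T)

print(mean_product)              # predicted by the multiplication theorem: 3.5
print(mean_square)               # predicted by Parseval's theorem (7.62): 7.5
print(math.sqrt(mean_square))    # RMS value of f
```

Only harmonics present in both signals contribute to the mean product, which is why the coefficient pairs at n = 1 and n = 2 are the only ones that appear.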


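Example 7.20 By applying Parseval's theorem to the function

$$f(t) = \frac{2t}{T} \quad (0 < t < 2T), \qquad f(t + 2T) = f(t)$$

considered in Example 7.19, show that

$$\tfrac{1}{6}\pi^2 = \sum_{n=1}^{\infty} \frac{1}{n^2}$$

Solution From Example 7.19, the coefficients of the complex Fourier series expansion of
f(t) are

$$c_0 = 2, \qquad c_n = \frac{j2}{n\pi} \quad (n \ne 0)$$

Thus, applying the Parseval's theorem result (7.61), noting that the period in this case
is 2T, we obtain

$$\frac{1}{2T}\int_{0}^{2T} [f(t)]^2\,dt = c_0^2 + \sum_{n=-\infty}^{-1} |c_n|^2 + \sum_{n=1}^{\infty} |c_n|^2$$

giving

$$\frac{1}{2T}\int_{0}^{2T} \frac{4t^2}{T^2}\,dt = 4 + 2\sum_{n=1}^{\infty} \left( \frac{2}{n\pi} \right)^2$$

which reduces to

$$\tfrac{16}{3} = 4 + \sum_{n=1}^{\infty} \frac{8}{n^2\pi^2}, \quad \text{so that} \quad \tfrac{1}{6}\pi^2 = \sum_{n=1}^{\infty} \frac{1}{n^2}$$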
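The identity of Example 7.20 can be checked by summing a large but finite number of terms. This is a rough numerical sketch (function names and truncation length are illustrative); since the tail of $\sum 1/n^2$ beyond N terms is about $1/N$, the agreement improves only slowly.

```python
import math

def parseval_rhs(terms):
    """Partial sum of c_0^2 + 2 * sum |c_n|^2 = 4 + sum 8 / (n^2 pi^2)."""
    return 4.0 + sum(8.0 / (n * n * math.pi ** 2) for n in range(1, terms + 1))

def reciprocal_squares(terms):
    """Partial sum of sum 1/n^2, which tends to pi^2 / 6."""
    return sum(1.0 / (n * n) for n in range(1, terms + 1))

N = 100000
print(parseval_rhs(N), 16.0 / 3.0)
print(reciprocal_squares(N), math.pi ** 2 / 6.0)
```

The left-hand side of Parseval's relation here is the mean square value $\tfrac{16}{3}$ of the sawtooth, so its RMS value is $4/\sqrt{3}$.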


7.6.3 Discrete frequency spectra


In expressing a periodic function f(t) by its Fourier series expansion, we are decomposing
the function into its harmonic or frequency components. We have seen that if f(t) is of
period T then it has frequency components at frequencies

$$\omega_n = \frac{2n\pi}{T} = n\omega_0 \quad (n = 1, 2, 3, \ldots) \tag{7.63}$$

where $\omega_0$ is the frequency of the parent function f(t). (All frequencies here are
measured in rad s$^{-1}$.)
A Fourier series may therefore be interpreted as constituting a frequency spectrum 
of the periodic function f(t), and provides an alternative representation of the function 
to its time-domain waveform. This frequency spectrum is often displayed by plotting 
graphs of both the amplitudes and phases of the various harmonic components against 
angular frequency @,. A plot of amplitude against angular frequency is called the 
amplitude spectrum, while that of phase against angular frequency is called the phase 
spectrum. For a periodic function f(t), of period T, harmonic components only occur 
at discrete frequencies @,, given by (7.59), so that these spectra are referred to as dis- 
crete frequency spectra or line spectra. In Chapter 8 Fourier transforms will be used 
to define continuous spectra for aperiodic functions. With the growing ability to process 
signals digitally, the representation of signals by their corresponding spectra is an 
approach widely used in almost all branches of engineering, especially electrical engin- 
eering, when considering topics such as filtering and modulation. An example of the 
use of a discrete spectral representation of a periodic function is in distortion measure- 
ments on amplifiers, where the harmonic content of the output, measured digitally, to a 
sinusoidal input provides a measure of the distortion. 
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Figure 7.34 Real 
discrete frequency 
spectrum. 


If the Fourier series expansion of a periodic function f(t), with period T, has been
obtained in the trigonometric form

\[ f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos\left(\frac{2\pi n t}{T}\right) + \sum_{n=1}^{\infty} b_n \sin\left(\frac{2\pi n t}{T}\right) \]

then, as indicated in Section 7.2.2, this may be expressed in terms of the various
harmonic components as

\[ f(t) = A_0 + \sum_{n=1}^{\infty} A_n \sin\left(\frac{2\pi n t}{T} + \phi_n\right) \tag{7.64} \]

where

\[ A_0 = \tfrac{1}{2}a_0, \qquad A_n = \sqrt{a_n^2 + b_n^2} \]

and the $\phi_n$ are determined by

\[ \sin\phi_n = \frac{a_n}{A_n}, \qquad \cos\phi_n = \frac{b_n}{A_n} \]

In this case a plot of $A_n$ against angular frequency $\omega_n$ will constitute the amplitude
spectrum and that of $\phi_n$ against $\omega_n$ the phase spectrum. These may be incorporated in
the same graph by indicating the various phases on the amplitude spectrum as illustrated
in Figure 7.34. It can be seen that the amplitude spectrum consists of a series
of equally spaced vertical lines whose lengths are proportional to the amplitudes of the
various harmonic components making up the function f(t). Clearly the trigonometric
form of the Fourier series does not in general lend itself to the plotting of the discrete
frequency spectrum, and the amplitudes $A_n$ and phases $\phi_n$ must first be determined from
the values of $a_n$ and $b_n$ previously determined.
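The conversion from the trigonometric coefficients to the amplitudes and phases of (7.64) can be sketched in a few lines. This is only an illustration: the function name `amplitude_phase` and the sample coefficients are assumptions made for the sketch, not values from the text.

```python
import math

def amplitude_phase(a_n, b_n):
    """Return (A_n, phi_n) such that
    a_n*cos(x) + b_n*sin(x) = A_n*sin(x + phi_n)."""
    A_n = math.hypot(a_n, b_n)
    # sin(phi_n) = a_n/A_n and cos(phi_n) = b_n/A_n, so phi_n = atan2(a_n, b_n)
    phi_n = math.atan2(a_n, b_n)
    return A_n, phi_n

# Illustrative coefficients (hypothetical, chosen for the sketch)
a_n, b_n = 3.0, 4.0
A_n, phi_n = amplitude_phase(a_n, b_n)

# Check the identity at an arbitrary angle x = 2*pi*n*t/T
x = 0.7
lhs = a_n * math.cos(x) + b_n * math.sin(x)
rhs = A_n * math.sin(x + phi_n)
print(A_n, phi_n, abs(lhs - rhs) < 1e-12)
```

Using `atan2` rather than a plain arctangent keeps the phase in the correct quadrant when $b_n$ is negative or zero.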


(Axes of Figure 7.34: amplitude $A_n$ against frequency $\omega_n$, with lines at $\omega_0, 2\omega_0, \ldots, 5\omega_0$.)


In work on signal analysis it is much more common to use the complex form of the
Fourier series. For a periodic function f(t), of period T, this is given by (7.53), with the
complex coefficients being given by

\[ c_n = |c_n|\,e^{\mathrm{j}\phi_n} \quad (n = 0, \pm 1, \pm 2, \ldots) \]

in which $|c_n|$ and $\phi_n$ denote the magnitude and argument of $c_n$ respectively. Since in
general $c_n$ is a complex quantity, we need two line spectra to determine the discrete
frequency spectrum; the amplitude spectrum being a plot of $|c_n|$ against $\omega_n$ and the
phase spectrum that of $\phi_n$ against $\omega_n$. In cases where $c_n$ is real a single spectrum may be
used to represent the function f(t). Since $|c_{-n}| = |c_n^*| = |c_n|$, the amplitude spectrum will
be symmetrical about the vertical axis, as illustrated in Figure 7.35.

Figure 7.35 Complex form of the amplitude spectrum.


Note that in the complex form of the discrete frequency spectrum we have components
at the discrete frequencies $0, \pm\omega_0, \pm 2\omega_0, \pm 3\omega_0, \ldots$; that is, both positive and
negative discrete frequencies are involved. Clearly signals having negative frequencies
are not physically realizable, and have been introduced for mathematical convenience.
At frequency $n\omega_0$ we have the component $e^{\mathrm{j}n\omega_0 t}$, which in itself is not a physical signal;
to obtain a physical signal, we must consider this alongside the corresponding component
$e^{-\mathrm{j}n\omega_0 t}$ at the frequency $-n\omega_0$, since then we have

\[ e^{\mathrm{j}n\omega_0 t} + e^{-\mathrm{j}n\omega_0 t} = 2\cos n\omega_0 t \tag{7.65} \]


Example 7.21 Plot the discrete amplitude and phase spectra for the periodic function

\[ f(t) = \frac{2t}{T} \quad (0 < t < 2T), \qquad f(t + 2T) = f(t) \]

of Example 7.19. Consider both complex and real forms.


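Solution In Example 7.19 the complex coefficients were determined as

\[ c_0 = 2, \qquad c_n = \frac{\mathrm{j}2}{n\pi} \quad (n = \pm 1, \pm 2, \pm 3, \ldots) \]

Thus

\[ |c_n| = \begin{cases} 2/n\pi & (n = 1, 2, 3, \ldots) \\ -2/n\pi & (n = -1, -2, -3, \ldots) \end{cases} \]

\[ \phi_n = \arg c_n = \begin{cases} \tfrac{1}{2}\pi & (n = 1, 2, 3, \ldots) \\ -\tfrac{1}{2}\pi & (n = -1, -2, -3, \ldots) \end{cases} \]

The corresponding amplitude and phase spectra are shown in Figures 7.36(a) and (b)
respectively.
In Example 7.19 we saw that the coefficients in the trigonometric form of the Fourier
series expansion of f(t) are

\[ a_0 = 4, \qquad a_n = 0, \qquad b_n = -\frac{4}{n\pi} \]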

Figure 7.36 Complex discrete frequency spectra for Example 7.21, with $\omega_0 = \pi/T$: (a) amplitude spectrum; (b) phase spectrum.

Figure 7.37 Real discrete frequency spectrum for Example 7.21 (corresponding to sinusoidal expansion).


so that the amplitude coefficients in (7.64) are

\[ A_0 = 2, \qquad A_n = \frac{4}{n\pi} \quad (n = 1, 2, 3, \ldots) \]

leading to the real discrete frequency spectrum of Figure 7.37.
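The coefficients quoted above can be checked numerically. The sketch below assumes the sawtooth $f(t) = 2t/T$ on $0 < t < 2T$ (the function whose coefficients $c_0 = 2$, $c_n = \mathrm{j}2/n\pi$ are quoted), picks an arbitrary value of $T$, and estimates $c_n$ with a midpoint rule; the helper name `complex_coefficient` is hypothetical.

```python
import cmath
import math

def complex_coefficient(f, n, T, steps=20000):
    """Midpoint-rule estimate of
    c_n = (1/2T) * integral_0^{2T} f(t) exp(-j n pi t / T) dt."""
    h = 2 * T / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * h
        total += f(t) * cmath.exp(-1j * n * math.pi * t / T)
    return total * h / (2 * T)

T = 1.5                      # arbitrary illustrative value; the period is 2T
f = lambda t: 2 * t / T      # assumed sawtooth on 0 < t < 2T

for n in (1, 2, 3):
    c_n = complex_coefficient(f, n, T)
    A_n = 4 / (n * math.pi)  # amplitude of the real (sinusoidal) form
    # Columns: n, |c_n|, arg c_n, A_n / 2
    print(n, abs(c_n), cmath.phase(c_n), A_n / 2)
```

The second and fourth columns should agree closely, which is the halving of the line amplitudes in the complex form relative to the real form.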


Since $|c_n| = \tfrac{1}{2}\sqrt{a_n^2 + b_n^2} = \tfrac{1}{2}A_n$, the amplitude spectrum lines in the complex form
(Figure 7.36) are, as expected, halved in amplitude relative to those in the real
representation (Figure 7.37), the other half-value being allocated to the corresponding
negative frequency. In the complex representation the phases at negative frequencies
(Figure 7.36b) are the negatives of those at the corresponding positive frequencies. In
our particular representation (7.64) of the real form the phases at positive frequencies
differ by $\tfrac{1}{2}\pi$ between the real and complex form. Again this is not surprising, since from
(7.65) we see that combining positive and negative frequencies in the complex form
leads to a cosinusoid at that frequency rather than a sinusoid. In order to maintain
equality of the phases at positive frequencies between the complex and real representations,
a cosinusoidal expansion

\[ f(t) = A_0 + \sum_{n=1}^{\infty} A_n \cos\left(\frac{2\pi n t}{T} + \phi_n\right) \tag{7.66} \]


of the real Fourier series is frequently adopted as an alternative to the sinusoidal series
expansion (7.64). Taking (7.66), the amplitude spectrum will remain the same as for
(7.64), but the phase spectrum will be determined by

\[ \sin\phi_n = -\frac{b_n}{A_n}, \qquad \cos\phi_n = \frac{a_n}{A_n} \]

showing a phase shift of $\tfrac{1}{2}\pi$ from that of (7.64). Adopting the real representation (7.66),
the corresponding real discrete frequency spectrum for the function f(t) of Example
7.21 is as illustrated in Figure 7.38.

Figure 7.38 Real discrete frequency spectrum for Example 7.21 (corresponding to cosinusoidal expansion).


Example 7.22 Determine the complex form of the Fourier series expansion of the periodic (period 2T)
infinite train of identical rectangular pulses of magnitude A and duration 2d illustrated
in Figure 7.39. Draw the discrete frequency spectrum in the particular case when $d = \tfrac{1}{10}$
and $T = \tfrac{1}{2}$.

Figure 7.39 Infinite train of rectangular pulses of Example 7.22.


Solution Over one period $-T < t < T$ the function f(t) representing the train is expressed as

\[ f(t) = \begin{cases} 0 & (-T < t < -d) \\ A & (-d < t < d) \\ 0 & (d < t < T) \end{cases} \]
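From (7.57), the complex coefficients $c_n$ are given by

\[ c_n = \frac{1}{2T}\int_{-T}^{T} f(t)\,e^{-\mathrm{j}n\pi t/T}\,dt = \frac{1}{2T}\int_{-d}^{d} A\,e^{-\mathrm{j}n\pi t/T}\,dt = \frac{A}{2T}\left[\frac{-T}{\mathrm{j}n\pi}\,e^{-\mathrm{j}n\pi t/T}\right]_{-d}^{d} \quad (n \neq 0) \]

\[ = \frac{A}{n\pi}\,\frac{e^{\mathrm{j}n\pi d/T} - e^{-\mathrm{j}n\pi d/T}}{2\mathrm{j}} = \frac{A}{n\pi}\sin\left(\frac{n\pi d}{T}\right) = \frac{Ad}{T}\,\frac{\sin(n\pi d/T)}{n\pi d/T} \quad (n = \pm 1, \pm 2, \ldots) \]

In the particular case when n = 0

\[ c_0 = \frac{1}{2T}\int_{-T}^{T} f(t)\,dt = \frac{1}{2T}\int_{-d}^{d} A\,dt = \frac{Ad}{T} \]

so that

\[ c_n = \frac{Ad}{T}\,\operatorname{sinc}\left(\frac{n\pi d}{T}\right) \quad (n = 0, \pm 1, \pm 2, \ldots) \]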

where the sinc function is defined by

\[ \operatorname{sinc} t = \begin{cases} \dfrac{\sin t}{t} & (t \neq 0) \\ 1 & (t = 0) \end{cases} \]

Thus from (7.53) the complex Fourier series expansion for the infinite train of pulses
f(t) is

\[ f(t) = \sum_{n=-\infty}^{\infty} \frac{Ad}{T}\,\operatorname{sinc}\left(\frac{n\pi d}{T}\right) e^{\mathrm{j}n\pi t/T} \]

As expected, since f(t) is an even function, $c_n$ is real, so we need only plot the discrete
amplitude spectrum to represent f(t). Since the amplitude spectrum is a plot of $|c_n|$
against frequency $n\omega_0$, with $\omega_0 = \pi/T$, it will only take values at the discrete frequency
values

\[ 0, \ \pm\frac{\pi}{T}, \ \pm\frac{2\pi}{T}, \ \pm\frac{3\pi}{T}, \ \ldots \]

In the particular case $d = \tfrac{1}{10}$, $T = \tfrac{1}{2}$, $\omega_0 = 2\pi$ the amplitude spectrum will only exist at
frequency values

\[ 0, \ \pm 2\pi, \ \pm 4\pi, \ \ldots \]

Since in this case

\[ c_n = \tfrac{1}{5}A\,\operatorname{sinc}\tfrac{1}{5}n\pi \quad (n = 0, \pm 1, \pm 2, \ldots) \]

noting that $\operatorname{sinc}\tfrac{1}{5}n\pi = 0$ when $\tfrac{1}{5}n\pi = m\pi$ or $n = 5m$ $(m = \pm 1, \pm 2, \ldots)$, the spectrum is
as shown in Figure 7.40.

Figure 7.40 Discrete amplitude spectrum for an infinite train of pulses when $d = \tfrac{1}{10}$ and $T = \tfrac{1}{2}$.
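The line amplitudes of the pulse-train spectrum can be tabulated directly from $c_n = (Ad/T)\operatorname{sinc}(n\pi d/T)$. The following is a minimal sketch for the case $d = \tfrac{1}{10}$, $T = \tfrac{1}{2}$; taking $A = 1$ and the function names used are assumptions for illustration.

```python
import math

def sinc(t):
    """sinc t = sin(t)/t with sinc 0 = 1 (the unnormalized form used in the text)."""
    return 1.0 if t == 0 else math.sin(t) / t

def pulse_train_coefficient(n, A, d, T):
    """c_n = (A d / T) sinc(n pi d / T) for pulses of height A, width 2d, period 2T."""
    return (A * d / T) * sinc(n * math.pi * d / T)

A, d, T = 1.0, 0.1, 0.5      # A = 1 is an illustrative choice
omega_0 = math.pi / T        # fundamental frequency, here 2*pi rad/s

for n in range(0, 11):
    # Columns: n, frequency n*omega_0, line amplitude |c_n|
    print(n, n * omega_0, abs(pulse_train_coefficient(n, A, d, T)))
```

The printed amplitudes vanish (to rounding error) at n = 5 and n = 10, the zeros at multiples of $10\pi$ rad s$^{-1}$.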


As we will see in Chapter 8, the sinc function sinc t = (sin t)/t plays an important role
in signal analysis, and it is sometimes referred to as the sampling function. A graph of
sinc t is shown in Figure 7.41, and it is clear that the function oscillates over intervals
of length $2\pi$ and decreases in amplitude with increasing t. Note also that the function
has zeros at $t = \pm n\pi$ (n = 1, 2, 3, ...).

Figure 7.41 Graph of sinc t.


7.6.4 Power spectrum


The average power P associated with a periodic signal f(t), of period T, is defined as
the mean square value; that is,

\[ P = \frac{1}{T}\int_{d}^{d+T} [f(t)]^2\,dt \tag{7.67} \]

For example, if f(t) represents a voltage waveform applied to a resistor then P represents
the average power, measured in watts, dissipated by a 1 $\Omega$ resistor.
By Parseval’s theorem (Theorem 7.6),

\[ P = \tfrac{1}{4}a_0^2 + \tfrac{1}{2}\sum_{n=1}^{\infty} (a_n^2 + b_n^2) \tag{7.68} \]

Since

\[ \frac{1}{T}\int_{d}^{d+T} \left[a_n \cos\left(\frac{2\pi n t}{T}\right)\right]^2 dt = \tfrac{1}{2}a_n^2, \qquad \frac{1}{T}\int_{d}^{d+T} \left[b_n \sin\left(\frac{2\pi n t}{T}\right)\right]^2 dt = \tfrac{1}{2}b_n^2 \]

the power in the nth harmonic is

\[ P_n = \tfrac{1}{2}(a_n^2 + b_n^2) \tag{7.69} \]

and it follows from (7.68) that the power of the periodic function f(t) is the sum of the
power of the individual harmonic components contained in f(t).
In terms of the complex Fourier coefficients, Parseval's theorem gives

\[ P = \sum_{n=-\infty}^{\infty} |c_n|^2 \tag{7.70} \]

As discussed in Section 7.6.3, the component $e^{\mathrm{j}n\omega_0 t}$ at frequency $\omega_n = n\omega_0$, $\omega_0 = 2\pi/T$,
must be considered alongside the component $e^{-\mathrm{j}n\omega_0 t}$ at the corresponding negative
frequency $-\omega_n$ in order to form the actual nth harmonic component of the function f(t).
Since $|c_{-n}|^2 = |c_n^*|^2 = |c_n|^2$, it follows that the power associated with the nth harmonic is
the sum of the power associated with $e^{\mathrm{j}n\omega_0 t}$ and $e^{-\mathrm{j}n\omega_0 t}$; that is,

\[ P_n = 2|c_n|^2 \tag{7.71} \]

which, since $|c_n| = \tfrac{1}{2}\sqrt{a_n^2 + b_n^2}$, corresponds to (7.69). Thus in the complex form half
the power of the nth harmonic is associated with the positive frequency and half with
the negative frequency.

Since the total power of a periodic signal is the sum of the power associated with
each of the harmonics of which the signal is composed, it is again useful to consider a
spectral representation, and a plot of $|c_n|^2$ against angular frequency $\omega_n$ is called the
power spectrum of the function f(t). Clearly such a spectrum is readily deduced from
the discrete amplitude spectrum of $|c_n|$ against angular frequency $\omega_n$.


Example 7.23 For the spectrum of the infinite train of rectangular pulses shown in Figure 7.39,
determine the percentage of the total power contained within the frequency band up to the
first zero value (called the zero crossing of the spectrum) at $10\pi$ rad s$^{-1}$.


Solution From (7.67), the total power associated with the infinite train of rectangular pulses f(t) is


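\[ P = \frac{1}{2T}\int_{-T}^{T} [f(t)]^2\,dt = \frac{1}{2T}\int_{-d}^{d} A^2\,dt \]

which in the particular case when $d = \tfrac{1}{10}$ and $T = \tfrac{1}{2}$ becomes

\[ P = \int_{-1/10}^{1/10} A^2\,dt = \tfrac{1}{5}A^2 \]

The power contained in the frequency band up to the first zero crossing at $10\pi$ rad s$^{-1}$ is

\[ P_1 = c_0^2 + 2(c_1^2 + c_2^2 + c_3^2 + c_4^2) \]

where

\[ c_n = \tfrac{1}{5}A\,\operatorname{sinc}\tfrac{1}{5}n\pi \]

That is,

\[ P_1 = \tfrac{1}{25}A^2 + \tfrac{2}{25}A^2\left(\operatorname{sinc}^2\tfrac{1}{5}\pi + \operatorname{sinc}^2\tfrac{2}{5}\pi + \operatorname{sinc}^2\tfrac{3}{5}\pi + \operatorname{sinc}^2\tfrac{4}{5}\pi\right) \]

\[ = \tfrac{1}{25}A^2\,[1 + 2(0.875 + 0.573 + 0.255 + 0.055)] \approx \tfrac{1}{5}A^2(0.903) \]

Thus $P_1 \approx 0.903P$, so that approximately 90.3% of the total power associated with f(t)
is contained in the frequency band up to the first zero crossing at $10\pi$ rad s$^{-1}$.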
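The band-power fraction in this example is easy to evaluate numerically. The sketch below takes $A = 1$ (an assumption; the fraction does not depend on $A$) and evaluates each $\operatorname{sinc}^2$ term to full precision. On that basis the four squared terms round to 0.875, 0.573, 0.255 and 0.055, and the fraction comes out at about 0.903.

```python
import math

def sinc(t):
    """Unnormalized sinc: sin(t)/t, with sinc(0) = 1."""
    return 1.0 if t == 0 else math.sin(t) / t

A, d, T = 1.0, 0.1, 0.5
total_power = A**2 * d / T   # (1/2T) * integral of A^2 over -d < t < d

def c(n):
    """Complex Fourier coefficient of the pulse train (real, since f is even)."""
    return (A * d / T) * sinc(n * math.pi * d / T)

# Harmonics below the first zero crossing at n = 5, that is, 10*pi rad/s
band_power = c(0) ** 2 + 2 * sum(c(n) ** 2 for n in range(1, 5))
fraction = band_power / total_power

squared_terms = [round(sinc(n * math.pi / 5) ** 2, 3) for n in range(1, 5)]
print(squared_terms, round(fraction, 3))
```

Extending the sum over many more harmonics should bring the fraction towards 1, which is Parseval's theorem (7.70) for this signal.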


Suppose that a periodic voltage v(t), of period T, applied to a linear circuit, results
in a corresponding current i(t), having the same period T. Then, given the Fourier series
representation of both the voltage and current at a pair of terminals, we can use the
multiplication theorem (Theorem 7.5) to obtain an expression for the average power P
at the terminals. Thus, given

\[ v(t) = \sum_{n=-\infty}^{\infty} c_n\,e^{\mathrm{j}2\pi n t/T}, \qquad i(t) = \sum_{n=-\infty}^{\infty} d_n\,e^{\mathrm{j}2\pi n t/T} \]

the instantaneous power at the terminals is vi and the average power is

\[ P = \frac{1}{T}\int_{d}^{d+T} v\,i\,dt = \sum_{n=-\infty}^{\infty} c_n d_n^* \]

or, in terms of the corresponding trigonometric Fourier series coefficients $a_n$, $b_n$ and
$\alpha_n$, $\beta_n$,

\[ P = \tfrac{1}{4}a_0\alpha_0 + \tfrac{1}{2}\sum_{n=1}^{\infty} (a_n\alpha_n + b_n\beta_n) \]

7.6.5 Exercises 





34 Show that the complex form of the Fourier series expansion of the periodic function

\[ f(t) = t^2 \quad (-\pi < t < \pi), \qquad f(t + 2\pi) = f(t) \]

is

\[ f(t) = \frac{\pi^2}{3} + \sum_{\substack{n=-\infty \\ n \neq 0}}^{\infty} \frac{2}{n^2}(-1)^n e^{\mathrm{j}nt} \]

Using (7.52), obtain the corresponding trigonometric series and check with the
series obtained in Example 7.5.

35 Obtain the complex form of the Fourier series expansion of the square wave

\[ f(t) = \begin{cases} 0 & (-2 < t < 0) \\ 1 & (0 < t < 2) \end{cases}, \qquad f(t + 4) = f(t) \]

Using (7.52), obtain the corresponding trigonometric series and check with the
series obtained in Example 7.7.

36 Obtain the complex form of the Fourier series expansion of the following periodic
functions.

(a) \[ f(t) = \begin{cases} \pi & (-\pi < t < 0) \\ t & (0 < t < \pi) \end{cases}, \qquad f(t + 2\pi) = f(t) \]

(b) \[ f(t) = \begin{cases} a\sin\omega t & (0 < t < \tfrac{1}{2}T) \\ 0 & (\tfrac{1}{2}T < t < T) \end{cases}, \qquad f(t + T) = f(t), \quad T = 2\pi/\omega \]

(c) \[ f(t) = \begin{cases} 2 & (-\pi < t < 0) \\ 1 & (0 < t < \pi) \end{cases}, \qquad f(t + 2\pi) = f(t) \]

(d) \[ f(t) = |\sin t| \quad (-\pi < t < \pi), \qquad f(t + 2\pi) = f(t) \]

37 A periodic function f(t), of period $2\pi$, is defined within the period $-\pi < t < \pi$ by

\[ f(t) = \begin{cases} 0 & (-\pi < t < 0) \\ 1 & (0 < t < \pi) \end{cases} \]

Using the Fourier coefficients of f(t), together with Parseval's theorem, show that

\[ \sum_{n=1}^{\infty} \frac{1}{(2n-1)^2} = \tfrac{1}{8}\pi^2 \]

(Note: The Fourier coefficients may be deduced from Example 7.7 or Exercise 35.)

38 (a) Show that the Fourier series expansion of the periodic function

\[ f(t) = 500\pi t \quad (0 < t < \tfrac{1}{50}), \qquad f(t + \tfrac{1}{50}) = f(t) \]

may be expressed as

\[ f(t) = 5\pi - 10\sum_{n=1}^{\infty} \frac{1}{n}\sin 100n\pi t \]

(b) Using (7.62), estimate the RMS value of f(t) by
(i) using the first four terms of the Fourier series;
(ii) using the first eight terms of the Fourier series.
(c) Obtain the true RMS value of f(t), and hence determine the percentage errors in the
estimated values obtained in (b).

39 A periodic voltage v(t) (in V) of period 5 ms and specified by

\[ v(t) = \begin{cases} 60 & (0 < t < 1.25\ \text{ms}) \\ 0 & (1.25\ \text{ms} < t < 5\ \text{ms}) \end{cases}, \qquad v(t + 5\ \text{ms}) = v(t) \]

is applied across the terminals of a 15 $\Omega$ resistor.
(a) Obtain expressions for the coefficients $c_n$ of the complex Fourier series representation
of v(t), and write down the values of the first five non-zero terms.
(b) Calculate the power associated with each of the first five non-zero terms of the Fourier
expansion.
(c) Calculate the total power delivered to the 15 $\Omega$ resistor.
(d) What is the percentage of the total power delivered to the resistor by the first five
non-zero terms of the Fourier series?

7.7 Orthogonal functions


As was noted in Section 7.2.2, the fact that the set of functions {1, cos ωt, sin ωt,
..., cos nωt, sin nωt, ...} is an orthogonal set of functions on the interval d ≤ t ≤ d + T
was crucial in the evaluation of the coefficients in the Fourier series expansion of a
function f(t). It is natural to ask whether it is possible to express the function f(t) as a
series expansion in other sets of functions. In the case of periodic functions f(t) there
is no natural alternative, but if we are concerned with representing a function f(t) only
in a finite interval $t_1 \le t \le t_2$ then a variety of other possibilities exist. These
possibilities are drawn from a class of functions called orthogonal functions, of which the
trigonometric set {1, cos ωt, sin ωt, ..., cos nωt, sin nωt} is a particular example.


7.7.1 Definitions 


Two real functions f(t) and g(t) that are piecewise-continuous in the interval $t_1 \le t \le t_2$
are said to be orthogonal in this interval if

\[ \int_{t_1}^{t_2} f(t)\,g(t)\,dt = 0 \]

A set of real functions $\phi_1(t), \phi_2(t), \ldots \equiv \{\phi_n(t)\}$, each of which is piecewise-continuous
on $t_1 \le t \le t_2$, is said to be an orthogonal set on this interval if $\phi_n(t)$ and $\phi_m(t)$ are
orthogonal for each pair of distinct indices n, m; that is, if

\[ \int_{t_1}^{t_2} \phi_n(t)\,\phi_m(t)\,dt = 0 \quad (n \neq m) \tag{7.72} \]

We shall also assume that no member of the set $\{\phi_n(t)\}$ is identically zero except at a
finite number of points, so that

\[ \int_{t_1}^{t_2} \phi_m^2(t)\,dt = \gamma_m \quad (m = 1, 2, 3, \ldots) \tag{7.73} \]

where $\gamma_m$ (m = 1, 2, ...) are all non-zero constants.


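An orthogonal set $\{\phi_n(t)\}$ is said to be orthonormal if each of its components is also
normalized; that is, $\gamma_m = 1$ (m = 1, 2, 3, ...). We note that any orthogonal set $\{\phi_n(t)\}$
can be converted into an orthonormal set by dividing each member $\phi_m(t)$ of the set by $\sqrt{\gamma_m}$.


Example 7.24 Since (7.6)–(7.10) hold,

{1, cos t, sin t, cos 2t, sin 2t, ..., cos nt, sin nt}

is an orthogonal set on the interval $d \le t \le d + 2\pi$, while the set

\[ \left\{ \frac{1}{\sqrt{2\pi}}, \ \frac{\cos t}{\sqrt{\pi}}, \ \frac{\sin t}{\sqrt{\pi}}, \ \ldots, \ \frac{\cos nt}{\sqrt{\pi}}, \ \frac{\sin nt}{\sqrt{\pi}} \right\} \]

forms an orthonormal set on the same interval.
The latter follows since

\[ \int_{d}^{d+2\pi} \left(\frac{1}{\sqrt{2\pi}}\right)^2 dt = 1 \]

\[ \int_{d}^{d+2\pi} \left(\frac{\cos nt}{\sqrt{\pi}}\right)^2 dt = \int_{d}^{d+2\pi} \left(\frac{\sin nt}{\sqrt{\pi}}\right)^2 dt = 1 \quad (n = 1, 2, 3, \ldots) \]


The definition of orthogonality considered so far applies to real functions, and has
to be amended somewhat if members of the set $\{\phi_n(t)\}$ are complex functions of the real
variable t. In such a case the set $\{\phi_n(t)\}$ is said to be an orthogonal set on the interval
$t_1 \le t \le t_2$ if

\[ \int_{t_1}^{t_2} \phi_n(t)\,\phi_m^*(t)\,dt = \begin{cases} 0 & (n \neq m) \\ \gamma & (n = m) \end{cases} \tag{7.74} \]

where $\phi^*(t)$ denotes the complex conjugate of $\phi(t)$.


Example 7.25 Verify that the set of complex exponential functions

\[ \{e^{\mathrm{j}n\pi t/T}\} \quad (n = 0, \pm 1, \pm 2, \pm 3, \ldots) \]

used in the complex representation of the Fourier series is an orthogonal set on the
interval $0 \le t \le 2T$.


Solution First,

\[ \int_{0}^{2T} e^{\mathrm{j}n\pi t/T}\,dt = \left[\frac{T}{\mathrm{j}n\pi}\,e^{\mathrm{j}n\pi t/T}\right]_{0}^{2T} = 0 \quad (n \neq 0) \]

since $e^{\mathrm{j}2n\pi} = e^{0} = 1$. Secondly,

\[ \int_{0}^{2T} e^{\mathrm{j}n\pi t/T}\,(e^{\mathrm{j}m\pi t/T})^*\,dt = \int_{0}^{2T} e^{\mathrm{j}(n-m)\pi t/T}\,dt = \left[\frac{T}{\mathrm{j}(n-m)\pi}\,e^{\mathrm{j}(n-m)\pi t/T}\right]_{0}^{2T} = 0 \quad (n \neq m) \]

and, when n = m,

\[ \int_{0}^{2T} e^{\mathrm{j}n\pi t/T}\,(e^{\mathrm{j}n\pi t/T})^*\,dt = \int_{0}^{2T} 1\,dt = 2T \]

Thus

\[ \int_{0}^{2T} e^{\mathrm{j}n\pi t/T} \cdot 1\,dt = 0 \quad (n \neq 0) \]

\[ \int_{0}^{2T} e^{\mathrm{j}n\pi t/T}\,(e^{\mathrm{j}m\pi t/T})^*\,dt = \begin{cases} 0 & (n \neq m) \\ 2T & (n = m) \end{cases} \]

and, from (7.74), the set is an orthogonal set on the interval $0 \le t \le 2T$.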
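The orthogonality relation for the complex exponentials can also be confirmed numerically. This is a minimal sketch using a midpoint rule; the value of T and the helper name `inner_product` are assumptions for illustration.

```python
import cmath
import math

def inner_product(n, m, T, steps=4096):
    """Midpoint estimate of
    integral_0^{2T} exp(j n pi t/T) * conj(exp(j m pi t/T)) dt."""
    h = 2 * T / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * h
        phi_n = cmath.exp(1j * n * math.pi * t / T)
        phi_m = cmath.exp(1j * m * math.pi * t / T)
        total += phi_n * phi_m.conjugate()
    return total * h

T = 0.75   # arbitrary illustrative value; the interval is 0 <= t <= 2T
for n, m in [(1, 1), (0, 0), (2, -2), (3, 1), (0, 4)]:
    print(n, m, inner_product(n, m, T))
```

Pairs with n = m should return a value close to 2T, and all other pairs a value close to zero. Note that the conjugate matters: without it the pair (2, -2) would not integrate to zero.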


The trigonometric and exponential sets are examples of orthogonal sets that we have
already used in developing the work on Fourier series. Examples of other sets of
orthogonal functions that are widely used in practice are Legendre polynomials, Bessel
functions, Hermite polynomials, Laguerre polynomials, Jacobi polynomials, Tchebyshev
(sometimes written as Chebyshev) polynomials and Walsh functions. In recent years
wavelets have become another widely used set of orthogonal functions, particularly
in applications such as signal processing and data compression.


7.7.2 Generalized Fourier series


Let $\{\phi_n(t)\}$ be an orthogonal set on the interval $t_1 \le t \le t_2$ and suppose that we wish to
represent the piecewise-continuous function f(t) in terms of this set within this interval.
Following the Fourier series development, suppose that it is possible to express f(t) as
a series expansion of the form

\[ f(t) = \sum_{n=1}^{\infty} c_n\phi_n(t) \tag{7.75} \]

We now wish to determine the coefficients $c_n$, and to do so we again follow the Fourier
series development. Multiplying (7.75) throughout by $\phi_m(t)$ and integrating term by
term, we obtain

\[ \int_{t_1}^{t_2} f(t)\,\phi_m(t)\,dt = \sum_{n=1}^{\infty} c_n \int_{t_1}^{t_2} \phi_n(t)\,\phi_m(t)\,dt \]

which, on using (7.72) and (7.73), reduces to

\[ \int_{t_1}^{t_2} f(t)\,\phi_m(t)\,dt = c_m\gamma_m \]

giving, on relabelling the index m as n,

\[ c_n = \frac{1}{\gamma_n}\int_{t_1}^{t_2} f(t)\,\phi_n(t)\,dt \quad (n = 1, 2, 3, \ldots) \tag{7.76} \]

Summary


Summarizing, if f(t) is a piecewise-continuous function on the interval $t_1 \le t \le t_2$
and $\{\phi_n(t)\}$ is an orthogonal set on this interval then the series

\[ f(t) = \sum_{n=1}^{\infty} c_n\phi_n(t) \]

is called the generalized Fourier series of f(t) with respect to the basis set $\{\phi_n(t)\}$,
and the coefficients $c_n$, given by (7.76), are called the generalized Fourier coefficients
with respect to the same basis set.


A parallel can be drawn between a generalized Fourier series expansion of a function
f(t) with respect to an orthogonal basis set of functions $\{\phi_n(t)\}$ and the representation
of a vector f in terms of an orthogonal basis set of vectors $v_1, v_2, \ldots, v_n$ as

\[ f = \alpha_1 v_1 + \ldots + \alpha_n v_n \]

where

\[ \alpha_i = \frac{f \cdot v_i}{v_i \cdot v_i} = \frac{f \cdot v_i}{|v_i|^2} \]

There is clearly a similarity between this pair of results and the pair (7.75)-(7.76).


7.7.3 Convergence of generalized Fourier series


As in the case of a Fourier series expansion, partial sums of the form

\[ F_N(t) = \sum_{n=1}^{N} c_n\phi_n(t) \tag{7.77} \]

can be considered, and we wish this representation to be, in some sense, a 'close
approximation’ to the parent function f(t). The question arises when considering such
a partial sum as to whether choosing the coefficients $c_n$ as the generalized Fourier
coefficients (7.76) leads to the ‘best’ approximation. Defining the mean square error
$E_N$ between the actual value of f(t) and the approximation $F_N(t)$ as

\[ E_N = \frac{1}{t_2 - t_1}\int_{t_1}^{t_2} [f(t) - F_N(t)]^2\,dt \]

it can be shown that $E_N$ is minimized, for all N, when the coefficients $c_n$ are chosen
according to (7.76). Thus in this sense the finite generalized Fourier series gives the best
approximation.

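To verify this result, assume, for convenience, that the set $\{\phi_n(t)\}$ is orthonormal,
and consider the Nth partial sum

\[ F_N(t) = \sum_{n=1}^{N} \tilde{c}_n\phi_n(t) \]

where the $\tilde{c}_n$ are to be chosen in order to minimize the mean square error $E_N$. Now

\[ (t_2 - t_1)E_N = \int_{t_1}^{t_2} \left[f(t) - \sum_{n=1}^{N} \tilde{c}_n\phi_n(t)\right]^2 dt \]

\[ = \int_{t_1}^{t_2} f^2(t)\,dt - 2\sum_{n=1}^{N} \tilde{c}_n \int_{t_1}^{t_2} f(t)\,\phi_n(t)\,dt + \sum_{n=1}^{N} \tilde{c}_n^2 \int_{t_1}^{t_2} \phi_n^2(t)\,dt \]

\[ = \int_{t_1}^{t_2} f^2(t)\,dt - 2\sum_{n=1}^{N} \tilde{c}_n c_n + \sum_{n=1}^{N} \tilde{c}_n^2 \]

since $\{\phi_n(t)\}$ is an orthonormal set. That is,

\[ (t_2 - t_1)E_N = \int_{t_1}^{t_2} f^2(t)\,dt - \sum_{n=1}^{N} c_n^2 + \sum_{n=1}^{N} (\tilde{c}_n - c_n)^2 \tag{7.78} \]

which is clearly minimized when $\tilde{c}_n = c_n$.
Taking $\tilde{c}_n = c_n$ in (7.78), the mean square error $E_N$ in approximating f(t) by $F_N(t)$ of
(7.77) is given by

\[ E_N = \frac{1}{t_2 - t_1}\left[\int_{t_1}^{t_2} f^2(t)\,dt - \sum_{n=1}^{N} c_n^2\right] \]

if the set $\{\phi_n(t)\}$ is orthonormal, and is given by

\[ E_N = \frac{1}{t_2 - t_1}\left[\int_{t_1}^{t_2} f^2(t)\,dt - \sum_{n=1}^{N} \gamma_n c_n^2\right] \tag{7.79} \]

if the set $\{\phi_n(t)\}$ is orthogonal.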
Since, by definition, $E_N$ is non-negative, it follows from (7.79) that

\[ \int_{t_1}^{t_2} f^2(t)\,dt \ge \sum_{n=1}^{N} \gamma_n c_n^2 \tag{7.80} \]

a result known as Bessel's inequality. The question that arises in practice is whether or
not $E_N \to 0$ as $N \to \infty$, indicating that the sum

\[ \sum_{n=1}^{N} c_n\phi_n(t) \]

converges to the function f(t). If this were the case then, from (7.79),

\[ \int_{t_1}^{t_2} f^2(t)\,dt = \sum_{n=1}^{\infty} \gamma_n c_n^2 \tag{7.81} \]


which is the generalized form of Parseval’s theorem, and the set $\{\phi_n(t)\}$ is said to
be complete. Strictly speaking, the fact that Parseval’s theorem holds ensures that the
partial sum $F_N(t)$ converges in the mean to the parent function f(t) as $N \to \infty$, and
this does not necessarily guarantee convergence at any particular point. In engineering
applications, however, this distinction may be overlooked, since for the functions
met in practice convergence in the mean also ensures pointwise convergence at points
where f(t) is continuous, and convergence to the mean of the discontinuity at points
where f(t) is discontinuous.


Example 7.26 The set {1, cos t, sin t, ..., cos nt, sin nt} is a complete orthogonal set in the interval
$d \le t \le d + 2\pi$. Following the same argument as above, it is readily shown that for a
function f(t) that is piecewise-continuous on $d \le t \le d + 2\pi$ the mean square error
between f(t) and the finite Fourier series

\[ F_N(t) = \tfrac{1}{2}\tilde{a}_0 + \sum_{n=1}^{N} \tilde{a}_n \cos nt + \sum_{n=1}^{N} \tilde{b}_n \sin nt \]

is minimized when $\tilde{a}_0$, $\tilde{a}_n$ and $\tilde{b}_n$ (n = 1, 2, 3, ...) are equal to the corresponding Fourier
coefficients $a_0$, $a_n$ and $b_n$ (n = 1, 2, 3, ...) determined using (7.4) and (7.5). In this case
the mean square error $E_N$ is given by

\[ E_N = \frac{1}{2\pi}\left[\int_{d}^{d+2\pi} f^2(t)\,dt - \pi\left(\tfrac{1}{2}a_0^2 + \sum_{n=1}^{N} (a_n^2 + b_n^2)\right)\right] \]

Bessel’s inequality (7.80) becomes

\[ \int_{d}^{d+2\pi} f^2(t)\,dt \ge \pi\left[\tfrac{1}{2}a_0^2 + \sum_{n=1}^{N} (a_n^2 + b_n^2)\right] \]

and Parseval's theorem (7.81) reduces to

\[ \frac{1}{2\pi}\int_{d}^{d+2\pi} [f(t)]^2\,dt = \tfrac{1}{4}a_0^2 + \tfrac{1}{2}\sum_{n=1}^{\infty} (a_n^2 + b_n^2) \]

which conforms with (7.62). Since, in this case, the basis set is complete, Parseval’s
theorem holds, and the Fourier series converges to f(t) in the sense discussed above.
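The behaviour of the mean square error can be illustrated with a function whose coefficients are known in closed form. The sketch below assumes the sawtooth $f(t) = t$ on $-\pi < t < \pi$ (period $2\pi$), for which $a_n = 0$ and $b_n = 2(-1)^{n+1}/n$ is the standard result; this function is chosen for the illustration and is not one of the text's examples.

```python
import math

def mean_square_error(N):
    """E_N for f(t) = t on (-pi, pi), using a_n = 0 and b_n = 2(-1)^(n+1)/n."""
    mean_square = math.pi ** 2 / 3   # (1/2pi) * integral of t^2 over one period
    harmonic_power = 0.5 * sum((2.0 / n) ** 2 for n in range(1, N + 1))
    return mean_square - harmonic_power

for N in (1, 2, 5, 50, 500):
    print(N, mean_square_error(N))
```

The error stays positive for every N, as Bessel's inequality requires, and decreases towards zero as N grows, in line with Parseval's theorem for a complete set.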


7.7.4 Exercises 


40 The Fourier series expansion for the periodic square wave

\[ f(t) = \begin{cases} -1 & (-\pi < t < 0) \\ 1 & (0 < t < \pi) \end{cases}, \qquad f(t + 2\pi) = f(t) \]

is

\[ f(t) = \frac{4}{\pi}\sum_{n=1}^{\infty} \frac{1}{2n-1}\sin(2n-1)t \]

Determine the mean square error corresponding to approximations to f(t) based on the
use of one term, two terms and three terms respectively in the series expansion.

41 The Legendre polynomials $P_n(t)$ are generated by the formula

\[ P_n(t) = \frac{1}{2^n n!}\,\frac{d^n}{dt^n}(t^2 - 1)^n \quad (n = 0, 1, 2, \ldots) \]

and satisfy the recurrence relationship

\[ nP_n(t) = (2n - 1)\,t\,P_{n-1}(t) - (n - 1)P_{n-2}(t) \]

(a) Deduce that

\[ P_0(t) = 1, \qquad P_1(t) = t, \qquad P_2(t) = \tfrac{1}{2}(3t^2 - 1), \qquad P_3(t) = \tfrac{1}{2}(5t^3 - 3t) \]

(b) Show that the polynomials form an orthogonal set on the interval (−1, 1) and, in
particular, that

\[ \int_{-1}^{1} P_m(t)\,P_n(t)\,dt = \begin{cases} 0 & (n \neq m) \\ 2/(2n + 1) & (n = m;\ m = 0, 1, 2, \ldots) \end{cases} \]

(c) Given that the function

\[ f(t) = \begin{cases} -1 & (-1 < t < 0) \\ 0 & (t = 0) \\ 1 & (0 < t < 1) \end{cases} \]

is expressed as a Fourier-Legendre series expansion

\[ f(t) = \sum_{r=0}^{\infty} c_r P_r(t) \]

determine the values of $c_0$, $c_1$, $c_2$ and $c_3$.

(d) Plot graphs to illustrate convergence of the series obtained in (c), and compare the
mean square error with that of the corresponding Fourier series expansion.


42 Repeat parts (c) and (d) of Exercise 41 for the function

\[ f(x) = \begin{cases} 0 & (-1 < x < 0) \\ x & (0 < x < 1) \end{cases} \]

43 Laguerre polynomials $L_n(t)$ are generated by the formula

\[ L_n(t) = e^{t}\,\frac{d^n}{dt^n}(t^n e^{-t}) \quad (n = 0, 1, 2, \ldots) \]

and satisfy the recurrence relation

\[ L_n(t) = (2n - 1 - t)L_{n-1}(t) - (n - 1)^2 L_{n-2}(t) \quad (n = 2, 3, \ldots) \]

These polynomials are orthogonal on the interval $0 \le t < \infty$ with respect to the weighting
function $e^{-t}$, so that

\[ \int_{0}^{\infty} e^{-t} L_n(t)\,L_m(t)\,dt = \begin{cases} 0 & (n \neq m) \\ (n!)^2 & (n = m) \end{cases} \]

(a) Deduce that

\[ L_0(t) = 1, \qquad L_1(t) = 1 - t, \qquad L_2(t) = 2 - 4t + t^2, \qquad L_3(t) = 6 - 18t + 9t^2 - t^3 \]

(b) Confirm the above orthogonality result in the case of $L_0$, $L_1$, $L_2$ and $L_3$.

(c) Given that the function f(t) is to be approximated over the interval $0 \le t < \infty$ by

\[ f(t) = \sum_{r=0}^{\infty} c_r L_r(t) \]

show that

\[ c_r = \frac{1}{(r!)^2}\int_{0}^{\infty} f(t)\,e^{-t} L_r(t)\,dt \quad (r = 0, 1, 2, \ldots) \]

(Note: Laguerre polynomials are of particular importance to engineers, since they can
be generated as the impulse responses of relatively simple networks.)


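44 Hermite polynomials $H_n(t)$ are generated by the formula

\[ H_n(t) = (-1)^n e^{t^2/2}\,\frac{d^n}{dt^n} e^{-t^2/2} \quad (n = 0, 1, 2, \ldots) \]

and satisfy the recurrence relationship

\[ H_n(t) = tH_{n-1}(t) - (n - 1)H_{n-2}(t) \quad (n = 2, 3, \ldots) \]

The polynomials are orthogonal on the interval $-\infty < t < \infty$ with respect to the weighting
function $e^{-t^2/2}$, so that

\[ \int_{-\infty}^{\infty} e^{-t^2/2} H_n(t)\,H_m(t)\,dt = \begin{cases} 0 & (n \neq m) \\ \sqrt{2\pi}\,n! & (n = m) \end{cases} \]

(a) Deduce that

\[ H_0(t) = 1, \quad H_1(t) = t, \quad H_2(t) = t^2 - 1, \quad H_3(t) = t^3 - 3t, \quad H_4(t) = t^4 - 6t^2 + 3 \]

(b) Confirm the above orthogonality result for $H_0$, $H_1$, $H_2$ and $H_3$.

(c) Given that the function f(t) is to be approximated over the interval $-\infty < t < \infty$ by

\[ f(t) = \sum_{r=0}^{\infty} c_r H_r(t) \]

show that

\[ c_r = \frac{1}{r!\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-t^2/2} f(t)\,H_r(t)\,dt \quad (r = 0, 1, 2, \ldots) \]

45 Tchebyshev polynomials $T_n(t)$ are generated by the formula

\[ T_n(t) = \cos(n\cos^{-1} t) \quad (n = 0, 1, 2, \ldots) \]

or

\[ T_n(t) = \sum_{r=0}^{[n/2]} (-1)^r \frac{n!}{(2r)!\,(n - 2r)!}\,(1 - t^2)^r\,t^{n-2r} \quad (n = 0, 1, 2, \ldots) \]

where

\[ [n/2] = \begin{cases} n/2 & (\text{even } n) \\ (n - 1)/2 & (\text{odd } n) \end{cases} \]

They also satisfy the recurrence relationship

\[ T_n(t) = 2tT_{n-1}(t) - T_{n-2}(t) \quad (n = 2, 3, \ldots) \]

and are orthogonal on the interval $-1 \le t \le 1$ with respect to the weighting function
$1/\sqrt{1 - t^2}$, so that

\[ \int_{-1}^{1} \frac{T_n(t)\,T_m(t)}{\sqrt{1 - t^2}}\,dt = \begin{cases} 0 & (m \neq n) \\ \tfrac{1}{2}\pi & (m = n \neq 0) \\ \pi & (m = n = 0) \end{cases} \]

(a) Deduce that

\[ T_0(t) = 1, \quad T_1(t) = t, \quad T_2(t) = 2t^2 - 1, \quad T_3(t) = 4t^3 - 3t \]

\[ T_4(t) = 8t^4 - 8t^2 + 1, \quad T_5(t) = 16t^5 - 20t^3 + 5t \]

(b) Confirm the above orthogonality result for $T_0$, $T_1$, $T_2$ and $T_3$.

(c) Given that the function f(t) is to be approximated over the interval $-1 \le t \le 1$ by

\[ f(t) = \sum_{r=0}^{\infty} c_r T_r(t) \]

show that

\[ c_0 = \frac{1}{\pi}\int_{-1}^{1} \frac{f(t)\,T_0(t)}{\sqrt{1 - t^2}}\,dt, \qquad c_r = \frac{2}{\pi}\int_{-1}^{1} \frac{f(t)\,T_r(t)}{\sqrt{1 - t^2}}\,dt \quad (r = 1, 2, \ldots) \]

46 With developments in digital techniques, Walsh functions $W_n(t)$ have become of
considerable importance in practice, since they are so easily generated by digital logic
circuitry. The first four Walsh functions may be defined on the interval $0 \le t \le T$ by

\[ W_0(t) = \frac{1}{\sqrt{T}} \quad (0 \le t \le T) \]

\[ W_1(t) = \begin{cases} 1/\sqrt{T} & (0 \le t < \tfrac{1}{2}T) \\ -1/\sqrt{T} & (\tfrac{1}{2}T < t \le T) \end{cases} \]

\[ W_2(t) = \begin{cases} 1/\sqrt{T} & (0 \le t < \tfrac{1}{4}T,\ \tfrac{3}{4}T < t \le T) \\ -1/\sqrt{T} & (\tfrac{1}{4}T < t < \tfrac{3}{4}T) \end{cases} \]

\[ W_3(t) = \begin{cases} 1/\sqrt{T} & (0 \le t < \tfrac{1}{8}T,\ \tfrac{3}{8}T < t < \tfrac{5}{8}T,\ \tfrac{7}{8}T < t \le T) \\ -1/\sqrt{T} & (\tfrac{1}{8}T < t < \tfrac{3}{8}T,\ \tfrac{5}{8}T < t < \tfrac{7}{8}T) \end{cases} \]

(a) Plot graphs of the functions $W_0(t)$, $W_1(t)$, $W_2(t)$ and $W_3(t)$, and show that they are
orthonormal on the interval $0 \le t \le T$. Write down an expression for $W_n(t)$.

(b) The Walsh functions may be used to obtain a Fourier-Walsh series expansion for a
function f(t), over the interval $0 \le t \le T$, in the form

\[ f(t) = \sum_{r=0}^{\infty} c_r W_r(t) \]

Illustrate this for the square wave of Exercise 40. What is the corresponding mean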
square error? Comment on your answer.


7.8 Engineering application: describing functions


Many control systems containing a nonlinear element may be represented by the block
diagram of Figure 7.42. In practice, describing function techniques are used to analyse
and design such control systems. Essentially the method involves replacing the
nonlinearity by an equivalent gain N and then using the techniques developed for linear
systems, such as the frequency response methods of Section 5.8. If the nonlinear
element is subjected to a sinusoidal input e(t) = X sin ωt then its output z(t) may be
represented by the Fourier series expansion

\[ z(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos n\omega t + \sum_{n=1}^{\infty} b_n \sin n\omega t = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} A_n \sin(n\omega t + \phi_n) \]

with $A_n = \sqrt{a_n^2 + b_n^2}$ and $\phi_n = \tan^{-1}(a_n/b_n)$.

Figure 7.42 Nonlinear control system (blocks: nonlinear element, linear element).

The describing function N(X) of the nonlinear element is then defined to be the
complex ratio of the fundamental component of the output to the input; that is,

\[ N(X) = \frac{A_1}{X}\,e^{\mathrm{j}\phi_1} \]

with N(X) being independent of the input frequency ω if the nonlinear element is
memory-free.
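The definition of N(X) can be turned into a direct numerical estimate by computing the fundamental coefficients $a_1$, $b_1$ of the output. The sketch below does this for an ideal on-off relay with output levels $\pm L$; the relay model, the helper name `describing_function` and the midpoint-rule integration are assumptions made for illustration. Since $A_1 e^{\mathrm{j}\phi_1} = b_1 + \mathrm{j}a_1$ under the phase convention above, N(X) is returned as $(b_1 + \mathrm{j}a_1)/X$.

```python
import math

def describing_function(nonlinearity, X, steps=20000):
    """Estimate N(X) = (b_1 + j a_1)/X for a memory-free element whose
    input is X sin(theta), with theta = omega * t over one period."""
    h = 2 * math.pi / steps
    a1 = b1 = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        z = nonlinearity(X * math.sin(theta))
        a1 += z * math.cos(theta) * h / math.pi
        b1 += z * math.sin(theta) * h / math.pi
    return complex(b1, a1) / X

L = 1.0
relay = lambda e: L if e > 0 else -L   # ideal relay (hypothetical model for the sketch)

for X in (0.5, 1.0, 2.0):
    # Columns: input amplitude, numerical N(X), closed-form value 4L/(pi X)
    print(X, describing_function(relay, X), 4 * L / (math.pi * X))
```

For this memory-free element the estimate is real to within rounding error and falls off inversely with the input amplitude X.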

Having determined the describing function, the behaviour of the closed-loop system
is then determined by the characteristic equation

\[ 1 + N(X)G(\mathrm{j}\omega) = 0 \]

If a combination of X and ω can be found to satisfy this equation then the system is
capable of sustained oscillations at that frequency and magnitude; that is, the system
exhibits limit-cycle behaviour. In general, more than one combination can be found,
and the resulting oscillations can be a stable or unstable limit cycle.

Normally the characteristic equation is investigated graphically by plotting G(jω)
and −1/N(X), for all values of X, on the same polar diagram. Limit cycles then occur at
frequencies and amplitudes corresponding to points of intersection of the curves.
Sometimes plotting can be avoided by calculating the maximum value of N(X) and hence the
value of the gain associated with G(s) that will just cause limit cycling to occur.

Using this background information, the following investigation is left as an exercise
for the reader to develop.

Figure 7.43 (a) Relay; (b) relay with dead zone.

Figure 7.44 Nonlinear system of exercise.


(a) Show that the describing functions $N_1(X)$ and $N_2(X)$ corresponding respectively
to the relay (on-off nonlinearity) of Figure 7.43(a) and the relay with dead zone
of Figure 7.43(b) are

\[ N_1(X) = \frac{4L}{\pi X}, \qquad N_2(X) = \frac{4L}{\pi X}\sqrt{1 - \left(\frac{h}{X}\right)^2} \]

(b) For the system of Figure 7.44 show that a limit cycle exists when the nonlinearity
is the relay of Figure 7.43(a) with L = 1. Determine the amplitude and frequency
of this limit cycle.

In an attempt to eliminate the limit-cycle oscillation, the relay is replaced by
the relay with dead zone illustrated in Figure 7.43(b), again with L = 1. Show that
this allows our objective to be achieved provided that $h > 10/(3\pi)$.


7.9 Review exercises (1-20) 


l A periodic function /(/) is defined by 2 Determine the full-range Fourier series expansion 
of the even function f(t) of period 27 defined by 
f (0<t<n) 


О(г Олт) m=] 
S(t + 2m) =f) 


Obtain a Fourier series expansion of f(t) and 


m-l 2y (0 <1<!n) 


in-f) Gu <t<n) 


To what value does the series converge at t= im? 


deduce that 3 A function f(!) is defined for 0 < t < 1T by 
= t (0 s £ x 1T) 
i= + wa 1 E 
a, ep Ct = =
Sketch odd and even functions that have a period
T and are equal to f(t) for 0 S t 1T. 


(a) Find the half-range Fourier sine series of f(t). 
(b) To what value will the series converge for 
t=-}T? 
(c) What is the sum of the following series? 8 


= 1 
S= — 
2 pum 


Prove that if g(x) is an odd function and f(x) an even 
function of x, the product g(x) [c + f(x)] is an odd 
function if c is a constant. 

A periodic function with period 27 is defined by 


F(8) - i 6(r? — 8?) 9 


in the interval -n < 0 < m. Show that the Fourier 
series representation of the function is 


оо п+1 
F(0)- m Ch sin nO 
n-l n 


A repeating waveform of period 27 is described by 
n+t (=n < t< in) 10 
KOSS 0 


t-T 


(in < t <in) 

Qn « t5 n) 

Sketch the waveform over the range t = —27 to 
t = 2m and find the Fourier series representation 
of f(t), making use of any properties of the 


waveform that you can identify before any 
integration is performed. 


A function f(x) is defined in the interval 


-| xxl by ПИ 
1/22 (“= <х<е 
fo еа 
0 (=== =е е) 


Sketch a graph of f(x) and show that a Fourier 
series expansion of f(x) valid in the interval 
—] «x«l is given by 





ПС в X Sn Лё oos np 
+ NTE 
= 


Show that the half-range Fourier sine series for the 
function 


no=(1-4) (0<t<n) 


is 


= $i Et LCD sium 


n=1 пт 


Find a half-range Fourier sine and Fourier cosine 
series for f(x) valid in the interval 0 < x < m 
when f(x) is defined by 


1 
X (0 x x s 5m) 


f= 


norm nm) 


Sketch the graph of the Fourier series obtained
for $-2\pi < x \le 2\pi$.


A function f(x) is periodic of period $2\pi$ and is
defined by $f(x) = e^{x}$ $(-\pi < x < \pi)$. Sketch the
graph of f(x) from $x = -2\pi$ to $x = 2\pi$ and prove
that


D Le y, CI (cos nx - n sin ni) 
T 


1+п 


п=1 


A function f(t) is defined on $0 \le t \le \pi$ by

$$f(t) = \pi - t$$

Find

(a) a half-range Fourier sine series, and

(b) a half-range Fourier cosine series for f(t)
valid for $0 < t < \pi$.

Sketch the graphs of the functions represented
by each series for $-2\pi < t < 2\pi$.


Show that the Fourier series 


1 4 x^ cos(2n - 1)f 
Ao 2 
n4 (2n-1) 


n=1 


represents the function f(t), of period $2\pi$,
given by 


ло} t (0<t<n) 


= (er S7 =) 


Deduce that, apart from a transient component 
(that is, a complementary function that dies away 
as t — co), the differential equation 


$t exc fir) 


has the solution 


x2in-iy cos(2n - m IT 1) 
T (a= bibe (uns OT 


14 
12 Show that if f(A) is a periodic function of period 
2n and 
t/T (0 <#<л) 
Ди) = 
(2n-t)m (m-«t-«2m) 
then 
д0 =1- 5 у SUR DY 15 
поа 
Show also that, when @ is not an integer, 
1 
у= (1 - соѕ 0) 
20 
=) slar +1)t- cos wt 
4 (20 + 1) [0 - (2n+1)] 
satisfies the differential equation 
diay ee 
SRI OBEN) 
dt 
subject to the initial conditions y = dy/dt = 0 at 16 
p=), 
13 (а) A periodic function f(t), of period 27, is 
defined in =r < t < x by 
= (ee Se =) 
дд=1 | 
t (0<t<n) 
Obtain a Fourier series expansion for f(t), and 
from it, using Parseval’s theorem, deduce that 
m elu 
MU 
(b) By formally differentiating the series obtained 
in (a), obtain the Fourier series expansion of 
the periodic square wave 
= =т= 0) 7 


g(t)=) 0 (t=0) 
Т 0 = т) 


g(t + 21) = g(t) 
Check the validity of your result by
determining directly the Fourier series
expansion of g(t).


A periodic function f(t), of period $2\pi$, is defined
in the range $-\pi < t < \pi$ by

$$f(t) = \sin \tfrac{1}{2}t$$

Show that the complex form of the Fourier series
expansion for f(t) is


ie) = > m 


(a) Find the Fourier series expansion of the 
voltage v(t) represented by the half-wave 
rectified sine wave 


(0 == 57) 


10 sin(2nt/T) 
“=| | 
Ст т) 


v(t * T) - 0) 


(b) If the voltage v(t) in (a) is applied to a
10 Ω resistor, what is the total average power
delivered to the resistor? What percentage
of the total power is carried by the second-
harmonic component of the voltage?


The periodic waveform f(t) shown in Figure 7.45 
may be written as 










Figure 7.45 Waveform f(t) of Review 
exercise 16. 


fü) -1- gA) 
where g(t) represents an odd function. 


(a) Sketch the graph of g(t). 

(b) Obtain the Fourier series expansion for g(t), 
and hence write down the Fourier series 
expansion for f(t). 


Show that the complex Fourier series expansion 
for the periodic function 

HOSE (OSES 2) 

S(t + 2m) =f) 
(a)


(b) 


(a) 


(b) 


(d) 


f)» x Y, i^ 


п=—со 


A square-wave voltage v(t) of period T is 
defined by 


v(t) - 
1 


v(t - T) - wt) 


(27 <1< 0) 


(0 « t « 1T) 20 


Show that its Fourier series expansion is 


given by 
4 х= sin[(4n - 2)nt/T] 
)=- 
un л У, 205 


Find the steady-state response of the circuit 
shown in Figure 7.46 to the sinusoidal input 
voltage 


v (t) 7 sin ot 


and hence write down the Fourier series 
expansion of the circuit's steady-state response 
to the square-wave voltage v(f) in (a). 


IE 


Figure 7.46 Circuit of Review exercise 18. 


vin) (~) 


ІН 


Defining the nth Tchebyshev polynomial by

$$T_n(t) = \cos(n \cos^{-1} t)$$

(a) use Euler's formula $\cos\theta = \tfrac{1}{2}(e^{j\theta} + e^{-j\theta})$
to obtain the expansions of $t^{2k}$ and $t^{2k+1}$
in Tchebyshev polynomials, where k is a
positive integer.

(b) Establish the recurrence relation

$$T_n(t) = 2t\,T_{n-1}(t) - T_{n-2}(t)$$

(c) Write down the values of $T_0(t)$ and $T_1(t)$ from
the definition, and then use (b) to find $T_2(t)$ and
$T_3(t)$.

(d) Express $t^5 - 5t^4 + 7t^3 + 6t - 8$ in Tchebyshev
polynomials.

(e) Find the cubic polynomial that approximates
to

$$t^5 - 5t^4 + 7t^3 + 6t - 8$$

over the interval (−1, 1) with the smallest
maximum error. Give an upper bound for
this error. Is there a value of t for which this
upper bound is attained?


The relationship between the input and output of
a relay with a dead zone Δ and no hysteresis is
shown in Figure 7.47. Show that the describing
function is

$$N(x_i) = \frac{4M}{\pi x_i}\left[1 - \left(\frac{\Delta}{2x_i}\right)^2\right]^{1/2}$$

for an input amplitude $x_i$.

Figure 7.47 Relay with dead zone of Review
exercise 20.

If this relay is used in the forward path of
the on-off positional control system shown in
Figure 7.48, where the transfer function

$$G(s) = \frac{K}{s(T_1 s + 1)(T_2 s + 1)}$$

characterizes the time constant of the servo-motor,
and the inertia and viscous damping of the load,
show that a limit-cycle oscillation will not occur
provided that the dead zone in the relay is such
that

$$\Delta > \frac{4MK}{\pi}\,\frac{T_1 T_2}{T_1 + T_2}$$








Figure 7.48 Positional control system of Review 
exercise 20. 
8.1 Introduction


In Chapter 7 we saw how Fourier series provided an ideal framework for analysing the 
steady-state response of systems to a periodic input signal. In this chapter we extend 
the ideas of Fourier analysis to deal with non-periodic functions. We do this through 
the introduction of the Fourier transform. As the theory develops, we shall see how the 
complex exponential form of the Fourier series representation of a periodic function 
emerges as a special case of the Fourier transform. Similarities between the transform 
and the Laplace transform, discussed in Chapter 5, will also be highlighted. 

While Fourier transforms first found most application in the solution of partial 
differential equations, it is probably true to say that today Fourier transform methods 
are most heavily used in the analysis of signals and systems. This chapter is therefore 
developed with such applications in mind, and its main aim is to develop an understand- 
ing of the underlying mathematics as a preparation for a specialist study of application 
areas in various branches of engineering. 

Throughout this book we draw attention to the impact of digital computers on engin- 
eering and thus on the mathematics required to understand engineering concepts. While 
much of the early work on signal analysis was implemented using analogue devices, the 
bulk of modern equipment exploits digital technology. In Chapter 5 we developed the 
Laplace transform as an aid to the analysis and design of continuous-time systems 
while in Chapter 6 we introduced the z and $\mathcal{D}$ transforms to assist with the analysis and
design of discrete-time systems. In this chapter the frequency-domain analysis intro- 
duced in Chapter 5 for continuous-time systems is consolidated and then extended to 
provide a framework for the frequency-domain description of discrete-time systems 
through the introduction of discrete Fourier transforms. These discrete transforms pro- 
vide one of the most advanced methods for discrete signal analysis, and are widely used 
in such fields as communications theory and speech and image processing. In practice, 
the computational aspects of the work assume great importance, and the use of appro- 
priate computational algorithms for the calculation of the discrete Fourier transform is 
essential. For this reason we have included an introduction to the fast Fourier transform 
algorithm, based on the pioneering work of J. W. Cooley and J. W. Tukey published 
in 1965, which it is hoped will serve the reader with the necessary understanding for 
progression to the understanding of specialist engineering applications. 

An additional engineering application section has been included in this new edition. 
In this we discuss the discrete-time Fourier transform to provide the means of describ- 
ing the so-called direct design method for digital filters which is based on the use of the 
desired frequency response, without using an analogue prototype design. This naturally 
leads to considering ‘windowing’ and a brief introduction to this topic is included. 


8.2 The Fourier transform


8.2.1 The Fourier integral


In Chapter 7 we saw how Fourier series methods provided a technique for the
frequency-domain representation of periodic functions. As indicated in Section 7.6.3,
in expressing a function as its Fourier series expansion we are decomposing the function
into its harmonic or frequency components. Thus a periodic function $f(t)$, of period $T'$,
has frequency components at discrete frequencies

$$\omega_n = \frac{2\pi n}{T'} = n\omega_0 \quad (n = 0, 1, 2, 3, \ldots)$$

where $\omega_0$ is the fundamental frequency, that is the frequency of the parent function $f(t)$.
Consequently we were able to interpret a Fourier series as constituting a discrete
frequency spectrum of the periodic function $f(t)$, thus providing an alternative
frequency-domain representation of the function to its time-domain waveform. However, not all
functions are periodic and so we need to develop an approach that will give a similar
representation for non-periodic functions, defined on $-\infty < t < \infty$. One way of
achieving this is to look at a portion of a non-periodic function $f(t)$ over an interval $T$, by
imagining that we are looking at a graph of $f(t)$ through a 'window' of length $T$, and
then to consider what happens as $T$ gets larger.

Figure 8.1 The view of f(t) through a window of length T.

Figure 8.2 The periodic function g(t) based on the 'windowed' view of f(t).

Figure 8.1 depicts this situation, with the window placed symmetrically about the
origin. We could now concentrate only on the 'view through the window' and carry out
a Fourier series development based on that portion of $f(t)$ alone. Whatever the
behaviour of $f(t)$ outside the window, the Fourier series thus generated would represent the
periodic function defined by

$$g(t) = \begin{cases} f(t) & (|t| < \tfrac{1}{2}T) \\ f(t - nT) & (\tfrac{1}{2}(2n-1)T < t < \tfrac{1}{2}(2n+1)T) \end{cases}$$

Figure 8.2 illustrates $g(t)$, and we can see that the graphs of $f(t)$ and $g(t)$ agree on the
interval $(-\tfrac{1}{2}T, \tfrac{1}{2}T)$. Note that this approach corresponds to the one adopted in Section
7.3 to obtain the Fourier series expansion of functions defined over a finite interval.

Using the complex or exponential form of the Fourier series expansion, we have
from (7.53) and (7.57) that

$$g(t) = \sum_{n=-\infty}^{\infty} G_n\, e^{jn\omega_0 t} \tag{8.1}$$

with

$$G_n = \frac{1}{T}\int_{-T/2}^{T/2} g(t)\, e^{-jn\omega_0 t}\, dt \tag{8.2}$$

and where

$$\omega_0 = 2\pi/T \tag{8.3}$$

Equation (8.2) in effect transforms the time-domain function $g(t)$ into the associated
frequency-domain components $G_n$, where n is any integer (positive, negative or zero).
Equation (8.1) can also be viewed as transforming the discrete components $G_n$ in the
frequency-domain representation to the time-domain form $g(t)$. Substituting for $G_n$ in
(8.1), using (8.2), we obtain

$$g(t) = \sum_{n=-\infty}^{\infty}\left[\frac{1}{T}\int_{-T/2}^{T/2} g(\tau)\, e^{-jn\omega_0\tau}\, d\tau\right] e^{jn\omega_0 t} \tag{8.4}$$

The frequency of the general term in the expansion (8.4) is

$$\frac{2\pi n}{T} = n\omega_0 = \omega_n$$

and so the difference in frequency between successive terms is

$$\frac{2\pi}{T}\,[(n+1) - n] = \frac{2\pi}{T} = \Delta\omega$$

Since $\Delta\omega = \omega_0$, we can express (8.4) as

$$g(t) = \sum_{n=-\infty}^{\infty}\frac{1}{2\pi}\left[\int_{-T/2}^{T/2} g(\tau)\, e^{-j\omega_n\tau}\, d\tau\right] e^{j\omega_n t}\,\Delta\omega \tag{8.5}$$

Defining $G(j\omega)$ as

$$G(j\omega) = \int_{-T/2}^{T/2} g(\tau)\, e^{-j\omega\tau}\, d\tau \tag{8.6}$$

we have

$$g(t) = \frac{1}{2\pi}\sum_{n=-\infty}^{\infty} e^{j\omega_n t}\, G(j\omega_n)\,\Delta\omega \tag{8.7}$$

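As $T \to \infty$, our window widens, so that $g(t) = f(t)$ everywhere and $\Delta\omega \to 0$. Since we
also have

$$\lim_{\Delta\omega \to 0}\frac{1}{2\pi}\sum_{n=-\infty}^{\infty} e^{j\omega_n t}\, G(j\omega_n)\,\Delta\omega = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{j\omega t}\, G(j\omega)\, d\omega$$

it follows from (8.7) and (8.6) that

$$f(t) = \int_{-\infty}^{\infty}\frac{1}{2\pi}\, e^{j\omega t}\left[\int_{-\infty}^{\infty} f(\tau)\, e^{-j\omega\tau}\, d\tau\right] d\omega \tag{8.8}$$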

The result (8.8) is known as the Fourier integral representation of $f(t)$. A set of
conditions that are sufficient for the existence of the Fourier integral is a revised form
of Dirichlet's conditions for Fourier series, contained in Theorem 7.2. These conditions
may be stated in the form of Theorem 8.1.


Theorem 8.1 Dirichlet's conditions for the Fourier integral

If the function $f(t)$ is such that

(a) it is absolutely integrable, so that

$$\int_{-\infty}^{\infty} |f(t)|\, dt < \infty$$

(that is, the integral is finite), and

(b) it has at most a finite number of maxima and minima and a finite number of
discontinuities in any finite interval

then the Fourier integral representation of $f(t)$, given in (8.8), converges to $f(t)$ at all
points where $f(t)$ is continuous and to the average of the right- and left-hand limits of
$f(t)$ where $f(t)$ is discontinuous (that is, to the mean of the discontinuity).

end of theorem


As was indicated in Section 7.2.9 for Fourier series, the use of the equality sign in 
(8.8) must be interpreted carefully because of the non-convergence to f(t) at points of 
discontinuity. Again the symbol ~ (read as ‘behaves as’ or ‘represented by’) rather than 
= is frequently used. 

The absolute integrability condition (a) of Theorem 8.1 implies that the absolute area
under the graph of $y = f(t)$ is finite. Clearly this is so if $f(t)$ decays sufficiently fast with
time. However, in general the condition seems to imply a very tight constraint on
the nature of $f(t)$, since clearly functions of the form $f(t) = \text{constant}$, $f(t) = e^{at}$, $f(t) = e^{-at}$,
$f(t) = \sin\omega t$, and so on, defined for $-\infty < t < \infty$, do not meet the requirement. In
practice, however, signals are usually causal and do not last for ever (that is, they only
exist for a finite time). Also, in practice no signal amplitude goes to infinity, so
consequently no practical signal $f(t)$ can have an infinite area under its graph $y = f(t)$. Thus
for practical signals the integral in (8.8) exists.

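To obtain the trigonometric (or real) form of the Fourier integral, we substitute

$$e^{-j\omega(\tau - t)} = \cos\omega(\tau - t) - j\sin\omega(\tau - t)$$

in (8.8) to give

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(\tau)\,[\cos\omega(\tau - t) - j\sin\omega(\tau - t)]\, d\tau\, d\omega$$

Since $\sin\omega(\tau - t)$ is an odd function of $\omega$, this reduces to

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(\tau)\cos\omega(\tau - t)\, d\tau\, d\omega$$

which, on noting that the integrand is an even function of $\omega$, reduces further to

$$f(t) = \frac{1}{\pi}\int_{0}^{\infty} d\omega\int_{-\infty}^{\infty} f(\tau)\cos\omega(\tau - t)\, d\tau \tag{8.9}$$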
The representation (8.9) is then the required trigonometric form of the Fourier
integral.

If $f(t)$ is either an odd function or an even function then further simplifications of (8.9)
are possible. Detailed calculations are left as an exercise for the reader, and we shall
simply quote the results.

(a) If $f(t)$ is an even function then (8.9) reduces to

$$f(t) = \frac{2}{\pi}\int_{0}^{\infty}\int_{0}^{\infty} f(\tau)\cos\omega\tau\,\cos\omega t\, d\tau\, d\omega \tag{8.10}$$

which is referred to as the Fourier cosine integral.

(b) If $f(t)$ is an odd function then (8.9) reduces to

$$f(t) = \frac{2}{\pi}\int_{0}^{\infty}\int_{0}^{\infty} f(\tau)\sin\omega\tau\,\sin\omega t\, d\tau\, d\omega \tag{8.11}$$

which is referred to as the Fourier sine integral.


In the case of the Fourier series representation of a periodic function it was a matter
of some interest to determine how well the first few terms of the expansion represented
the function. The corresponding problem in the non-periodic case is to investigate how
well the Fourier integral represents a function when only the components in the lower
part of the (continuous) frequency range are taken into account. To illustrate, consider
the rectangular pulse of Figure 8.3 given by

$$f(t) = \begin{cases} 1 & (|t| \le 1) \\ 0 & (|t| > 1) \end{cases}$$

Figure 8.3 Rectangular pulse $f(t) = 1$ $(|t| \le 1)$, $0$ $(|t| > 1)$.

This is clearly an even function, so from (8.10) its Fourier integral is

$$f(t) = \frac{2}{\pi}\int_{0}^{\infty}\int_{0}^{1} 1\cdot\cos\omega\tau\,\cos\omega t\, d\tau\, d\omega = \frac{2}{\pi}\int_{0}^{\infty}\frac{\cos\omega t\,\sin\omega}{\omega}\, d\omega$$

An elementary evaluation of this integral is not possible, so we consider frequencies
$\omega < \omega_0$, when

$$f(t) \simeq \frac{2}{\pi}\int_{0}^{\omega_0}\frac{\cos\omega t\,\sin\omega}{\omega}\, d\omega
= \frac{1}{\pi}\int_{0}^{\omega_0}\frac{\sin\omega(t+1)}{\omega}\, d\omega - \frac{1}{\pi}\int_{0}^{\omega_0}\frac{\sin\omega(t-1)}{\omega}\, d\omega
= \frac{1}{\pi}\int_{0}^{\omega_0(t+1)}\frac{\sin u}{u}\, du - \frac{1}{\pi}\int_{0}^{\omega_0(t-1)}\frac{\sin u}{u}\, du$$

Figure 8.4 Plot of (8.12): (a) $\omega_0 = 4$; (b) $\omega_0 = 8$; (c) $\omega_0 = 16$.

The integral

$$\mathrm{Si}(x) = \int_{0}^{x}\frac{\sin u}{u}\, du$$

occurs frequently, and it can be shown that

$$\mathrm{Si}(x) = \sum_{n=0}^{\infty}\frac{(-1)^n x^{2n+1}}{(2n+1)(2n+1)!}$$

Its values have been tabulated (see for example L. Rade and B. Westergren, Beta
Mathematics Handbook, Chartwell-Bratt Ltd, Bromley, Kent, 1990). Thus

$$f(t) \simeq \frac{1}{\pi}\,\mathrm{Si}(\omega_0(t+1)) - \frac{1}{\pi}\,\mathrm{Si}(\omega_0(t-1)) \tag{8.12}$$


This has been plotted for $\omega_0 = 4$, 8 and 16, and the responses are shown in Figures 8.4(a),
(b) and (c) respectively. Physically, these responses describe the output of an ideal
low-pass filter, cutting out all frequencies $\omega > \omega_0$, when the input signal is the
rectangular pulse of Figure 8.3. The reader will no doubt note the similarities with the
Fourier series discussion of Section 7.2.9 and the continuing existence of the Gibbs
phenomenon.
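
The behaviour described above can be explored numerically. The following is a minimal sketch, not part of the original text: it evaluates the right-hand side of (8.12) using only the Python standard library, with Si(x) approximated by Simpson's rule rather than by the series. The function names `si` and `band_limited_pulse`, and the number of Simpson panels, are illustrative assumptions.

```python
import math

def si(x, n=2000):
    """Sine integral Si(x): integral of sin(u)/u from 0 to x, by composite Simpson's rule."""
    if x == 0.0:
        return 0.0
    if x < 0.0:
        return -si(-x, n)              # Si is an odd function
    h = x / n                          # n must be even for Simpson's rule

    def g(u):
        # The integrand has a removable singularity at u = 0, where it equals 1.
        return 1.0 if u == 0.0 else math.sin(u) / u

    total = g(0.0) + g(x)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0

def band_limited_pulse(t, w0):
    """Right-hand side of (8.12): the unit pulse rebuilt from frequencies below w0."""
    return (si(w0 * (t + 1.0)) - si(w0 * (t - 1.0))) / math.pi

# Tabulate the approximation at a few times for the cut-offs used in Figure 8.4.
for w0 in (4.0, 8.0, 16.0):
    print(w0, [round(band_limited_pulse(t, w0), 3) for t in (0.0, 0.5, 1.0, 1.5)])
```

Inside the pulse the values stay close to 1, at the edge t = 1 they are close to 1/2 (the mean of the discontinuity), and outside they are close to 0, with the overshoot near the edges persisting as the cut-off is raised.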








8.2.2 The Fourier transform pair


We note from (8.6) and (8.7) that the Fourier integral (8.8) may be written in the form
of the pair of equations

$$F(j\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt \tag{8.13}$$

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(j\omega)\, e^{j\omega t}\, d\omega \tag{8.14}$$


$F(j\omega)$ as defined by (8.13) is called the Fourier transform of $f(t)$, and it provides
a frequency-domain representation of the non-periodic function $f(t)$, whenever the
integral in (8.13) exists. Note that we have used the notation $F(j\omega)$ for the Fourier
transform of $f(t)$ rather than the alternative $F(\omega)$, which is also in common use. The reason
for this choice is a consequence of the relationship between the Fourier and Laplace
transforms, which will emerge later in Section 8.4.1. We stress that this is a choice that
we have made, but the reader should have no difficulty in using either form, provided
that once the choice has been made it is then adhered to. Equation (8.14) then provides
us with a way of reconstructing $f(t)$ if we know its Fourier transform $F(j\omega)$.

A word of caution is in order here regarding the scaling factor $1/2\pi$ in (8.14).
Although the convention that we have adopted here is fairly standard, some authors
associate the factor $1/2\pi$ with (8.13) rather than (8.14), while others associate a factor
of $(2\pi)^{-1/2}$ with each of (8.13) and (8.14). In all cases the pair combine to give the
Fourier integral (8.8). We could overcome this possible confusion by measuring the
frequency in cycles per second or hertz rather than in radians per second, this being
achieved using the substitution $f = \omega/2\pi$, where $f$ is in hertz and $\omega$ is in radians per
second. We have not adopted this approach, since $\omega$ is so widely used by engineers.

In line with our notation for Laplace transforms in Chapter 5, we introduce the
symbol $\mathcal{F}$ to denote the Fourier transform operator. Then from (8.13) the Fourier transform
$\mathcal{F}\{f(t)\}$ of a function $f(t)$ is defined by

$$\mathcal{F}\{f(t)\} = F(j\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt \tag{8.15}$$

whenever the integral exists. Similarly, using (8.14), we define the inverse Fourier
transform of $G(j\omega)$ as

$$\mathcal{F}^{-1}\{G(j\omega)\} = g(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} G(j\omega)\, e^{j\omega t}\, d\omega \tag{8.16}$$

whenever the integral exists. The relations (8.15) and (8.16) together constitute the
Fourier transform pair, and they provide a pathway between the time- and frequency-
domain representations of a function. Equation (8.15) expresses $f(t)$ in the frequency
domain, and is analogous to resolving it into harmonic components with a continuously
varying frequency $\omega$. This contrasts with a Fourier series representation of a periodic
function, where the resolved frequencies take discrete values.


The conditions for the existence of the Fourier transform $F(j\omega)$ of the function $f(t)$
are Dirichlet's conditions (Theorem 8.1). Corresponding trigonometric forms of the
Fourier transform pair may be readily written down from (8.9), (8.10) and (8.11).


Example 8.1

Does the function

$$f(t) = 1 \quad (-\infty < t < \infty)$$

have a Fourier transform representation?

Solution

Since the area under the curve of $y = f(t)$ $(-\infty < t < \infty)$ is infinite, it follows that
$\int_{-\infty}^{\infty}|f(t)|\, dt$ is unbounded, so the conditions of Theorem 8.1 are not satisfied. We can
confirm that the Fourier transform does not exist from the definition (8.15). We have

$$\int_{-\infty}^{\infty} 1\, e^{-j\omega t}\, dt = \lim_{\alpha\to\infty}\int_{-\alpha}^{\alpha} e^{-j\omega t}\, dt
= \lim_{\alpha\to\infty}\left[-\frac{1}{j\omega}\left(e^{-j\omega\alpha} - e^{j\omega\alpha}\right)\right]
= \lim_{\alpha\to\infty}\frac{2\sin\omega\alpha}{\omega}$$

Since this last limit does not exist, we conclude that $f(t) = 1$ $(-\infty < t < \infty)$ does not
have a Fourier transform representation.
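
The failure of the limit in Example 8.1 is easy to see numerically. The short sketch below is an illustration only (the name `truncated_transform` and the sample values of the truncation length are assumptions): it evaluates the truncated integral, 2 sin(ωα)/ω, at a fixed frequency for increasing α.

```python
import math

def truncated_transform(w, alpha):
    """Integral of 1*exp(-j*w*t) over (-alpha, alpha); equals 2*sin(w*alpha)/w for w != 0."""
    return 2.0 * math.sin(w * alpha) / w

w = 1.0
# Increasing the truncation length does not settle the value: it keeps oscillating.
values = [truncated_transform(w, alpha) for alpha in (10.0, 100.0, 1000.0, 10000.0)]
print(values)
```

The printed values remain bounded by 2/ω in magnitude but continue to oscillate rather than approaching a single number, which is the sense in which the limit does not exist.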


It is clear, using integration by parts, that $f(t) = t$ $(-\infty < t < \infty)$ does not have a
Fourier transform, nor indeed does $f(t) = t^n$ ($n > 1$, an integer; $-\infty < t < \infty$). While
neither $e^{at}$ nor $e^{-at}$ $(a > 0)$ has a Fourier transform, when we consider the causal signal
$f(t) = H(t)\,e^{-at}$ $(a > 0)$, we do obtain a transform.


Example 8.2

Find the Fourier transform of the one-sided exponential function

$$f(t) = H(t)\, e^{-at} \quad (a > 0)$$

where $H(t)$ is the Heaviside unit step function.

Solution

The graph of $f(t)$ is shown in Figure 8.5, and we can show that the area under the graph
is bounded. Hence, by Theorem 8.1, a Fourier transform exists. Using the definition
(8.15), we have

$$\mathcal{F}\{f(t)\} = \int_{-\infty}^{\infty} H(t)\, e^{-at}\, e^{-j\omega t}\, dt \quad (a > 0)
= \int_{0}^{\infty} e^{-(a+j\omega)t}\, dt = \left[-\frac{e^{-(a+j\omega)t}}{a + j\omega}\right]_{0}^{\infty}$$

so that

$$\mathcal{F}\{H(t)\, e^{-at}\} = \frac{1}{a + j\omega} \tag{8.17}$$

Figure 8.5 The 'one-sided' exponential function $f(t) = H(t)\,e^{-at}$ $(a > 0)$.


Example 8.3

Calculate the Fourier transform of the rectangular pulse

$$f(t) = \begin{cases} A & (|t| \le T) \\ 0 & (|t| > T) \end{cases}$$

Solution

The graph of $f(t)$ is shown in Figure 8.6, and since the area under it is finite, a Fourier
transform exists. From the definition (8.15), we have

$$F(j\omega) = \int_{-T}^{T} A\, e^{-j\omega t}\, dt
= \begin{cases} \left[-\dfrac{A}{j\omega}\, e^{-j\omega t}\right]_{-T}^{T} & (\omega \ne 0) \\ 2AT & (\omega = 0) \end{cases}
= 2AT\,\mathrm{sinc}\,\omega T$$

where sinc x is defined, as in Example 7.22, by

$$\mathrm{sinc}\, x = \begin{cases} \dfrac{\sin x}{x} & (x \ne 0) \\ 1 & (x = 0) \end{cases}$$

Figure 8.6 The rectangular pulse $f(t) = A$ $(|t| \le T)$, $0$ $(|t| > T)$.

Figure 8.7 A brief table of Fourier transforms.

    f(t)                                 F{f(t)} = F(jω) = ∫ f(t) e^{-jωt} dt
    -----------------------------------  ------------------------------------
    e^{-at} H(t)    (a > 0)              1/(a + jω)
    t e^{-at} H(t)  (a > 0)              1/(a + jω)^2
    A (|t| ≤ T), 0 (|t| > T)             2AT sinc ωT
    e^{-a|t|}       (a > 0)              2a/(a^2 + ω^2)

By direct use of the definition (8.15), we can, as in Examples 8.2 and 8.3, determine
the Fourier transforms of some standard functions. A brief table of transforms is given
in Figure 8.7.
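
The entries obtained in Examples 8.2 and 8.3 can be checked by evaluating the defining integral (8.15) numerically. The sketch below is one possible check, assuming Simpson's rule on a truncated interval; the helper name `fourier_transform` and the parameter values a = 2, A = 3, T = 1.5, ω = 0.7 are illustrative choices, not taken from the text.

```python
import cmath
import math

def fourier_transform(f, w, t_min, t_max, n=20000):
    """Approximate the integral of f(t)*exp(-j*w*t) over [t_min, t_max] by Simpson's rule (n even)."""
    h = (t_max - t_min) / n

    def g(t):
        return f(t) * cmath.exp(-1j * w * t)

    total = g(t_min) + g(t_max)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(t_min + k * h)
    return total * h / 3.0

a, A, T, w = 2.0, 3.0, 1.5, 0.7     # illustrative values only

# One-sided exponential e^{-at}H(t): integrate over [0, 40], beyond which e^{-at} is negligible.
one_sided = fourier_transform(lambda t: math.exp(-a * t), w, 0.0, 40.0)

# Rectangular pulse of height A on |t| <= T.
pulse = fourier_transform(lambda t: A, w, -T, T)

print(one_sided, 1.0 / (a + 1j * w))               # compare with (8.17)
print(pulse, 2.0 * A * math.sin(w * T) / w)        # compare with 2AT sinc(wT)
```

Each printed pair should agree to many decimal places; the second transform is real apart from rounding error, as expected for an even function of t.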


In MATLAB, incorporating the Symbolic Math Toolbox, the Fourier transform F(jω)
of f(t) is obtained using the commands

    syms w t
    F=fourier(f(t),t,w)

whilst the inverse Fourier transform f(t) of F(jω) is obtained using the command

    f=ifourier(F(jw),w,t)

Corresponding commands in MAPLE are

    with(inttrans):
    F:=fourier(f(t),t,w);
    f:=invfourier(F(jw),w,t);

Returning to Example 8.2, and considering the particular case of a = 2, the
commands

    syms w t
    H=sym('Heaviside(t)');
    F=fourier(H*exp(-2*t))

in MATLAB return

    F=1/(2+i*w)

as expected. In MATLAB there is no assume command (as in MAPLE) to enable
us to specify that a > 0. However, since abs(a) = a for a > 0, the following commands
in MATLAB can be used to deal with the general case

    syms w t a
    H=sym('Heaviside(t)');
    F=fourier(H*exp(-abs(a)*t),t,w)

As another illustration, consider the function f(t) = e^{-a|t|}, a > 0, given in the
table of Figure 8.7. Considering the particular case a = 2 then the MATLAB
commands

    syms w t
    F=fourier(exp(-2*abs(t)),t,w)

return

    F=4/(4+w^2)

as specified in the table. It is left as an exercise to consider the general case of a. To
illustrate the use, in MATLAB, of the ifourier command this transform can be
inverted using the commands

    syms w t
    f=ifourier(4/(4+w^2),w,t)

which return

    f=Heaviside(t)*exp(-2*t)+exp(2*t)*Heaviside(-t)

which corresponds to the expected answer f = exp(-2*abs(t)).

As another illustration consider the Fourier transform F(ω) = 1/(a + jω)^2 given in
the second entry of the table in Figure 8.7. The MATLAB commands

    syms w t a
    f=ifourier(1/(a+i*w)^2,w,t)

return

    f=t*exp(-a*t)*Heaviside(t)

as given in the table.

Considering the rectangular pulse f(t) of Example 8.3, we first express the pulse
in terms of Heaviside functions as

    f(t) = A(H(t + T) - H(t - T))

and then use the MATLAB commands

    syms w t T A
    H=sym('Heaviside(t+T)-Heaviside(t-T)');
    F=fourier(A*H,t,w);
    F=simple(F)

which return

    F=2*A*sin(T*w)/w


8.2.3 The continuous Fourier spectra


From Figure 8.7, it is clear that Fourier transforms are generally complex-valued
functions of the real frequency variable $\omega$. If $\mathcal{F}\{f(t)\} = F(j\omega)$ is the Fourier transform of
the signal $f(t)$ then $F(j\omega)$ is also known as the (complex) frequency spectrum of $f(t)$.
Writing $F(j\omega)$ in the exponential form

$$F(j\omega) = |F(j\omega)|\, e^{j\arg F(j\omega)}$$

plots of $|F(j\omega)|$ and $\arg F(j\omega)$, which are both real-valued functions of $\omega$, are called the
amplitude and phase spectra respectively of the signal $f(t)$. These two spectra
represent the frequency-domain portrait of the signal $f(t)$. In contrast to the situation when
$f(t)$ was periodic, where (as shown in Section 7.6.3) the amplitude and phase spectra
were defined only at discrete values of $\omega$, we now see that both spectra are defined for
all values of the continuous variable $\omega$.


Example 8.4

Determine the amplitude and phase spectra of the causal signal

$$f(t) = e^{-at}H(t) \quad (a > 0)$$

and plot their graphs.

Solution

From (8.17),

$$\mathcal{F}\{f(t)\} = F(j\omega) = \frac{1}{a + j\omega}$$

Thus the amplitude and argument of $F(j\omega)$ are

$$|F(j\omega)| = \frac{1}{\sqrt{a^2 + \omega^2}} \tag{8.18}$$

$$\arg F(j\omega) = \arg(1) - \arg(a + j\omega) = -\tan^{-1}\left(\frac{\omega}{a}\right) \tag{8.19}$$

These are the amplitude and phase spectra of $f(t)$, and are plotted in Figure 8.8.

Figure 8.8 (a) Amplitude and (b) phase spectra of the one-sided exponential
function $f(t) = e^{-at}H(t)$ $(a > 0)$.
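
As a quick numerical companion to (8.18) and (8.19), the sketch below evaluates the amplitude and phase of 1/(a + jω) directly with Python's complex arithmetic. The function name `spectra` and the value a = 2 are illustrative assumptions.

```python
import cmath
import math

def spectra(a, w):
    """Amplitude and phase of F(jw) = 1/(a + jw), the transform of exp(-a*t)*H(t)."""
    F = 1.0 / (a + 1j * w)
    return abs(F), cmath.phase(F)

a = 2.0   # illustrative decay rate
for w in (-4.0, -2.0, 0.0, 2.0, 4.0):
    amp, ph = spectra(a, w)
    # Compare with the closed forms (8.18) and (8.19).
    print(w, amp, 1.0 / math.sqrt(a * a + w * w), ph, -math.atan(w / a))
```

The table makes the symmetry visible: the amplitude is an even function of ω with peak value 1/a at ω = 0, while the phase is odd and tends to ∓π/2 as ω → ±∞.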







Generally, as we have observed, the Fourier transform and thus the frequency spec- 
trum are complex-valued quantities. In some cases, as for instance in Example 8.3, the 
spectrum is purely real. In Example 8.3 we found that the transform of the pulse illus- 
trated in Figure 8.6 was 


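$$F(j\omega) = 2AT\,\mathrm{sinc}\,\omega T$$

where

$$\mathrm{sinc}\,\omega T = \begin{cases} \dfrac{\sin\omega T}{\omega T} & (\omega \ne 0) \\ 1 & (\omega = 0) \end{cases}$$

is an even function of $\omega$, taking both positive and negative values. In this case the
amplitude and phase spectra are given by

$$|F(j\omega)| = 2AT\,|\mathrm{sinc}\,\omega T| \tag{8.20}$$

$$\arg F(j\omega) = \begin{cases} 0 & (\mathrm{sinc}\,\omega T \ge 0) \\ \pi & (\mathrm{sinc}\,\omega T < 0) \end{cases} \tag{8.21}$$

with corresponding graphs shown in Figure 8.9.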
Figure 8.9 (a) Amplitude and (b) phase spectra of the pulse $f(t) = A$ $(|t| \le T)$, $0$ $(|t| > T)$.

Figure 8.10 Frequency spectrum (real-valued) of the pulse $f(t) = A$ $(|t| \le T)$, $0$ $(|t| > T)$.



In fact, when the Fourier transform is a purely real-valued function, we can plot all
the information on a single frequency spectrum of $F(j\omega)$ versus $\omega$. For the rectangular
pulse of Figure 8.6 the resulting graph is shown in Figure 8.10.

From Figure 8.7, we can see that the Fourier transforms discussed so far have
two properties in common. First, the amplitude spectra are even functions of the
frequency variable $\omega$. This is always the case when the time signal $f(t)$ is real; that
is, loosely speaking, a consequence of the fact that we have decomposed, or analysed
$f(t)$, relative to complex exponentials rather than real-valued sines and cosines. The
second common feature is that all the amplitude spectra decrease rapidly as $\omega$ increases.
This means that most of the information concerning the 'shape' of the signal $f(t)$
is contained in a fairly small interval of the frequency axis around $\omega = 0$. From another
point of view, we see that a device capable of passing signals of frequencies up to
about $\omega = 3\pi/T$ would pass a reasonably accurate version of the rectangular pulse of
Example 8.3.

8.2.4 Exercises 
Whenever possible check your answers using MATLAB or MAPLE.

1 Calculate the Fourier transform of the two-sided
exponential pulse given by

$$f(t) = \begin{cases} e^{at} & (t \le 0) \\ e^{-at} & (t > 0) \end{cases} \quad (a > 0)$$

2 Determine the Fourier transform of the 'on-off'
pulse shown in Figure 8.11.

Figure 8.11 The 'on-off' pulse.

3 A triangular pulse is defined by

$$f(t) = \begin{cases} (A/T)t + A & (-T \le t \le 0) \\ (-A/T)t + A & (0 < t \le T) \end{cases}$$

Sketch f(t) and determine its Fourier transform.
What is the relationship between this pulse and
that of Exercise 2?

4 Determine the Fourier transforms of

$$f(t) = \begin{cases} 2K & (|t| \le 2) \\ 0 & (|t| > 2) \end{cases} \qquad
g(t) = \begin{cases} K & (|t| \le 1) \\ 0 & (|t| > 1) \end{cases}$$

Sketch the function h(t) = f(t) − g(t) and determine
its Fourier transform.

5 Calculate the Fourier transform of the 'off-on-off'
pulse f(t) defined by

$$f(t) = \begin{cases} 0 & (t < -2) \\ -1 & (-2 \le t < -1) \\ 1 & (-1 \le t \le 1) \\ -1 & (1 < t \le 2) \\ 0 & (t > 2) \end{cases}$$

6 Show that the Fourier transform of

$$f(t) = \begin{cases} \sin at & (|t| \le \pi/a) \\ 0 & (|t| > \pi/a) \end{cases}$$

is

$$\frac{j2a\sin(\pi\omega/a)}{\omega^2 - a^2}$$

7 Calculate the Fourier transform of

$$f(t) = e^{-at}\sin\omega_0 t\, H(t)$$

8 Based on (8.10) and (8.11), define the Fourier sine
transform as

$$F_s(x) = \int_{0}^{\infty} f(t)\sin xt\, dt$$

and the Fourier cosine transform as

$$F_c(x) = \int_{0}^{\infty} f(t)\cos xt\, dt$$

Show that 
0 (t <0) 
I(t) =jcosat (0<t<a) 
0 (t >a) 


has Fourier cosine transform 


jur tx)a à and ae 
-x 


l+x 


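9 Show that the Fourier sine and cosine transforms of

$$f(t) = \begin{cases} 0 & (t < 0) \\ 1 & (0 \le t \le a) \\ 0 & (t > a) \end{cases}$$

are

$$\frac{1 - \cos xa}{x}, \qquad \frac{\sin xa}{x}$$

respectively.

10 Find the sine and cosine transforms of
$f(t) = e^{-at}H(t)$ $(a > 0)$.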
8.3 Properties of the Fourier transform


In this section we establish some of the properties of the Fourier transform that allow
its use as a practical tool in system analysis and design.


8.3.1 The linearity property


Linearity is a fundamental property of the Fourier transform, and may be stated 
as follows. 


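If $f(t)$ and $g(t)$ are functions having Fourier transforms $F(j\omega)$ and $G(j\omega)$
respectively, and if $\alpha$ and $\beta$ are constants, then

$$\mathcal{F}\{\alpha f(t) + \beta g(t)\} = \alpha\mathcal{F}\{f(t)\} + \beta\mathcal{F}\{g(t)\} = \alpha F(j\omega) + \beta G(j\omega) \tag{8.22}$$

As a consequence of this, we say that the Fourier transform operator $\mathcal{F}$ is a linear
operator. The proof of this property follows readily from the definition (8.15), since

$$\mathcal{F}\{\alpha f(t) + \beta g(t)\} = \int_{-\infty}^{\infty}[\alpha f(t) + \beta g(t)]\, e^{-j\omega t}\, dt
= \alpha\int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt + \beta\int_{-\infty}^{\infty} g(t)\, e^{-j\omega t}\, dt
= \alpha F(j\omega) + \beta G(j\omega)$$

Clearly the linearity property also applies to the inverse transform operator $\mathcal{F}^{-1}$.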


8.3.2 Time-differentiation property

If the function $f(t)$ has a Fourier transform $F(j\omega)$ then, by (8.16),

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(j\omega)\, e^{j\omega t}\, d\omega$$

Differentiating with respect to t gives

$$\frac{df}{dt} = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{\partial}{\partial t}\left[F(j\omega)\, e^{j\omega t}\right] d\omega
= \frac{1}{2\pi}\int_{-\infty}^{\infty}(j\omega)F(j\omega)\, e^{j\omega t}\, d\omega$$

implying that the time signal $df/dt$ is the inverse Fourier transform of $(j\omega)F(j\omega)$. In
other words

$$\mathcal{F}\left\{\frac{df}{dt}\right\} = (j\omega)F(j\omega)$$

Repeating the argument n times, it follows that

$$\mathcal{F}\left\{\frac{d^n f}{dt^n}\right\} = (j\omega)^n F(j\omega) \tag{8.23}$$

The result (8.23) is referred to as the time-differentiation property, and may be used
to obtain frequency-domain representations of differential equations.


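Example 8.5

Show that if the time signals $y(t)$ and $u(t)$ have Fourier transforms $Y(j\omega)$ and $U(j\omega)$
respectively, and if

$$\frac{d^2 y(t)}{dt^2} + 3\frac{dy(t)}{dt} + 7y(t) = 3\frac{du(t)}{dt} + 2u(t) \tag{8.24}$$

then $Y(j\omega) = G(j\omega)U(j\omega)$ for some function $G(j\omega)$.

Solution

Taking Fourier transforms throughout in (8.24), we have

$$\mathcal{F}\left\{\frac{d^2 y(t)}{dt^2} + 3\frac{dy(t)}{dt} + 7y(t)\right\} = \mathcal{F}\left\{3\frac{du(t)}{dt} + 2u(t)\right\}$$

which, on using the linearity property (8.22), reduces to

$$\mathcal{F}\left\{\frac{d^2 y(t)}{dt^2}\right\} + 3\mathcal{F}\left\{\frac{dy(t)}{dt}\right\} + 7\mathcal{F}\{y(t)\} = 3\mathcal{F}\left\{\frac{du(t)}{dt}\right\} + 2\mathcal{F}\{u(t)\}$$

Then, from (8.23),

$$(j\omega)^2 Y(j\omega) + 3(j\omega)Y(j\omega) + 7Y(j\omega) = 3(j\omega)U(j\omega) + 2U(j\omega)$$

that is,

$$(-\omega^2 + j3\omega + 7)\,Y(j\omega) = (j3\omega + 2)\,U(j\omega)$$

giving

$$Y(j\omega) = G(j\omega)U(j\omega)$$

where

$$G(j\omega) = \frac{2 + j3\omega}{7 - \omega^2 + j3\omega}$$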

The reader may at this stage be fearing that we are about to propose yet another 
method for solving differential equations. This is not the idea! Rather, we shall show 
that the Fourier transform provides an essential tool for the analysis (and synthesis) of 
linear systems from the viewpoint of the frequency domain. 


8.3.3 Time-shift property


If a function f(t) has Fourier transform F(jω) then what is the Fourier transform of the
shifted version g(t) = f(t − τ), where τ is a constant? From the definition (8.15),

$$\mathcal{F}\{g(t)\} = \int_{-\infty}^{\infty} g(t)\, e^{-j\omega t}\, dt = \int_{-\infty}^{\infty} f(t - \tau)\, e^{-j\omega t}\, dt$$

Making the substitution x = t − τ, we have

$$\mathcal{F}\{g(t)\} = \int_{-\infty}^{\infty} f(x)\, e^{-j\omega(x + \tau)}\, dx = e^{-j\omega\tau} \int_{-\infty}^{\infty} f(x)\, e^{-j\omega x}\, dx = e^{-j\omega\tau} F(j\omega)$$

that is,

$$\mathcal{F}\{f(t - \tau)\} = e^{-j\omega\tau} F(j\omega) \tag{8.25}$$

The result (8.25) is known as the time-shift property, and implies that delaying a signal
by a time τ causes its Fourier transform to be multiplied by $e^{-j\omega\tau}$.

Since

$$|e^{-j\omega\tau}| = |\cos\omega\tau - j\sin\omega\tau| = \left| \sqrt{\cos^2\omega\tau + \sin^2\omega\tau} \right| = 1$$

we have

$$|e^{-j\omega\tau} F(j\omega)| = |F(j\omega)|$$

indicating that the amplitude spectrum of f(t − τ) is identical with that of f(t). However,

$$\arg[e^{-j\omega\tau} F(j\omega)] = \arg F(j\omega) - \arg e^{j\omega\tau} = \arg F(j\omega) - \omega\tau$$

indicating that each frequency component is shifted by an amount proportional to its
frequency ω.

Example 8.6 Determine the Fourier transform of the rectangular pulse f(t) shown in
Figure 8.12.

Figure 8.12 Rectangular pulse of Example 8.6.

Solution This is just the pulse of Example 8.3 (shown in Figure 8.6), delayed by T. The
pulse of Example 8.3 had a Fourier transform 2AT sinc ωT, and so, using the shift property
(8.25) with τ = T, we have

$$\mathcal{F}\{f(t)\} = F(j\omega) = e^{-j\omega T}\, 2AT\, \mathrm{sinc}\, \omega T = 2AT\, e^{-j\omega T}\, \mathrm{sinc}\, \omega T$$
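The time-shift property (8.25) can also be checked numerically. The Python sketch below
approximates the defining integral (8.15) by the midpoint rule for the one-sided
exponential $e^{-at}H(t)$ with a = 1 and for a delayed copy of it. The choice of test
signal, the delay `TAU`, the truncation of the integral at t = 40 and the function names
are all illustrative assumptions.

```python
import cmath
import math


def fourier_numeric(f, w, t_min, t_max, steps=20000):
    """Midpoint-rule approximation of the integral of f(t) e^{-jwt} dt."""
    h = (t_max - t_min) / steps
    total = 0j
    for k in range(steps):
        t = t_min + (k + 0.5) * h
        total += f(t) * cmath.exp(-1j * w * t)
    return total * h


def one_sided_exp(t, a=1.0):
    """f(t) = e^{-at} H(t)."""
    return math.exp(-a * t) if t >= 0 else 0.0


TAU = 0.75  # illustrative delay


def delayed(t):
    """g(t) = f(t - TAU)."""
    return one_sided_exp(t - TAU)


if __name__ == "__main__":
    w = 2.0
    F = fourier_numeric(one_sided_exp, w, 0.0, 40.0)
    G = fourier_numeric(delayed, w, TAU, 40.0 + TAU)
    print(abs(G - cmath.exp(-1j * w * TAU) * F))  # difference predicted to vanish by (8.25)
    print(abs(G), abs(F))                         # amplitude spectra coincide
```

Only the phase of the transform changes under the delay; the two printed magnitudes agree
to within rounding error.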


8.3.4 Frequency-shift property


Suppose that a function f(t) has Fourier transform F(jω). Then, from the definition
(8.15), the Fourier transform of the related function $g(t) = e^{j\omega_0 t} f(t)$ is

$$\begin{aligned}
\mathcal{F}\{e^{j\omega_0 t} f(t)\} &= \int_{-\infty}^{\infty} e^{j\omega_0 t} f(t)\, e^{-j\omega t}\, dt = \int_{-\infty}^{\infty} f(t)\, e^{-j(\omega - \omega_0)t}\, dt \\
&= \int_{-\infty}^{\infty} f(t)\, e^{-j\tilde{\omega} t}\, dt, \quad \text{where } \tilde{\omega} = \omega - \omega_0 \\
&= F(j\tilde{\omega}), \quad \text{by definition}
\end{aligned}$$

Thus

$$\mathcal{F}\{e^{j\omega_0 t} f(t)\} = F(j(\omega - \omega_0)) \tag{8.26}$$

The result (8.26) is known as the frequency-shift property, and indicates that
multiplication by $e^{j\omega_0 t}$ simply shifts the spectrum of f(t) so that it is centred
on the point $\omega = \omega_0$ in the frequency domain. This phenomenon is the mathematical
foundation for the process of modulation in communication theory, illustrated in Example 8.7.

Example 8.7 Determine the frequency spectrum of the signal $g(t) = f(t) \cos \omega_c t$.

Solution Since $\cos \omega_c t = \tfrac{1}{2}(e^{j\omega_c t} + e^{-j\omega_c t})$, it follows,
using the linearity property (8.22), that

$$\mathcal{F}\{g(t)\} = \mathcal{F}\{\tfrac{1}{2} f(t)(e^{j\omega_c t} + e^{-j\omega_c t})\} = \tfrac{1}{2}\mathcal{F}\{f(t)\, e^{j\omega_c t}\} + \tfrac{1}{2}\mathcal{F}\{f(t)\, e^{-j\omega_c t}\}$$

If $\mathcal{F}\{f(t)\} = F(j\omega)$ then, using (8.26),

$$\mathcal{F}\{f(t) \cos \omega_c t\} = \mathcal{F}\{g(t)\} = \tfrac{1}{2} F(j(\omega - \omega_c)) + \tfrac{1}{2} F(j(\omega + \omega_c))$$

The effect of multiplying the signal f(t) by the carrier signal $\cos \omega_c t$ is thus to
produce a signal whose spectrum consists of two (scaled) versions of F(jω), the spectrum of
f(t): one centred on $\omega = \omega_c$ and the other on $\omega = -\omega_c$. The carrier
signal $\cos \omega_c t$ is said to be modulated by the signal f(t).

Demodulation is considered in Exercise 5, Section 8.10, and the ideas of modulation
and demodulation are developed in Section 8.8.
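The modulation result of Example 8.7 can be illustrated with the rectangular pulse of
Example 8.3, whose transform 2AT sinc ωT is known. The Python sketch below compares a
midpoint-rule evaluation of the transform of the modulated pulse with the two shifted copies
predicted by (8.26). The parameter values, the carrier frequency used in the comparison and
the function names are illustrative assumptions.

```python
import cmath
import math


def sinc(x):
    """sinc x = sin(x)/x with sinc 0 = 1, the convention used for the pulse transform."""
    return 1.0 if x == 0 else math.sin(x) / x


def pulse_transform(w, A=1.0, T=1.0):
    """F(jw) = 2AT sinc(wT) for the pulse of Example 8.3."""
    return 2 * A * T * sinc(w * T)


def modulated_transform_numeric(w, w_c, A=1.0, T=1.0, steps=20000):
    """Midpoint-rule transform of A cos(w_c t) restricted to |t| <= T."""
    h = 2 * T / steps
    total = 0j
    for k in range(steps):
        t = -T + (k + 0.5) * h
        total += A * math.cos(w_c * t) * cmath.exp(-1j * w * t)
    return total * h


def modulated_transform_formula(w, w_c, A=1.0, T=1.0):
    """Half-sum of the two shifted spectra, as in Example 8.7."""
    return 0.5 * (pulse_transform(w - w_c, A, T) + pulse_transform(w + w_c, A, T))


if __name__ == "__main__":
    w_c = 10.0
    for w in (0.0, 2.5, 10.0):
        print(w, modulated_transform_numeric(w, w_c), modulated_transform_formula(w, w_c))
```

At ω = ω_c the output is close to AT, half the peak value 2AT of the unmodulated spectrum,
which is the scaling by one half referred to in the example.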


8.3.5 The symmetry property


From the definition of the transform pair (8.15) and (8.16) it is apparent that there is
some symmetry of structure in relation to the variables t and ω. We can establish the
exact form of this symmetry as follows. From (8.16),

$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(j\omega)\, e^{j\omega t}\, d\omega$$

or, equivalently, by changing the 'dummy' variable in the integration,

$$2\pi f(t) = \int_{-\infty}^{\infty} F(jy)\, e^{jyt}\, dy$$

so that

$$2\pi f(-t) = \int_{-\infty}^{\infty} F(jy)\, e^{-jyt}\, dy$$

or, on replacing t by ω,

$$2\pi f(-\omega) = \int_{-\infty}^{\infty} F(jy)\, e^{-jy\omega}\, dy \tag{8.27}$$

The right-hand side of (8.27) is simply the definition (8.15) of the Fourier transform
of F(jt), with the integration variable t replaced by y. We therefore conclude that

$$\mathcal{F}\{F(jt)\} = 2\pi f(-\omega) \tag{8.28a}$$

given that

$$\mathcal{F}\{f(t)\} = F(j\omega) \tag{8.28b}$$

What (8.28) tells us is that if f(t) and F(jω) form a Fourier transform pair then F(jt)
and 2πf(−ω) also form a Fourier transform pair. This property is referred to as the
symmetry property of Fourier transforms. It is also sometimes referred to as the
duality property.

Example 8.8 Determine the Fourier transform of the signal

$$g(t) = C\, \mathrm{sinc}\, at = \begin{cases} \dfrac{C \sin at}{at} & (t \ne 0) \\ C & (t = 0) \end{cases} \tag{8.29}$$

Solution From Example 8.3, we know that if

$$f(t) = \begin{cases} A & (|t| \le T) \\ 0 & (|t| > T) \end{cases} \tag{8.30}$$

then

$$\mathcal{F}\{f(t)\} = F(j\omega) = 2AT\, \mathrm{sinc}\, \omega T$$

Thus, by the symmetry property (8.28), F(jt) and 2πf(−ω) are also a Fourier transform
pair. In this case

$$F(jt) = 2AT\, \mathrm{sinc}\, tT$$

and so, choosing T = a and A = C/2a to correspond to (8.29), we see that

$$F(jt) = C\, \mathrm{sinc}\, at = g(t)$$

has Fourier transform 2πf(−ω). Rewriting (8.30), we find that, since |ω| = |−ω|,

$$\mathcal{F}\{C\, \mathrm{sinc}\, at\} = \begin{cases} 2\pi C/2a & (|\omega| \le a) \\ 0 & (|\omega| > a) \end{cases} = \begin{cases} \pi C/a & (|\omega| \le a) \\ 0 & (|\omega| > a) \end{cases}$$

A graph of g(t) and its Fourier transform G(jω) = 2πf(−ω) is shown in Figure 8.13.

Figure 8.13 The Fourier transform pair g(t) and G(jω) of Example 8.8.

Using the MATLAB commands

    syms w t a C
    F=fourier(C*sin(a*t)/(a*t));
    F=simple(F)

returns

    F=C*pi*(-Heaviside(w-a)+Heaviside(w+a))/a

which is the answer given in the solution expressed in terms of Heaviside functions.


8.3.6 Exercises 


Whenever possible check your answers using MATLAB or MAPLE. 


11 Use the linearity property to verify the result in Exercise 4.

12 If y(t) and u(t) are signals with Fourier transforms Y(jω) and U(jω) respectively, and

$$\frac{d^2 y(t)}{dt^2} + 3\frac{dy(t)}{dt} + y(t) = u(t)$$

   show that Y(jω) = H(jω)U(jω) for some function H(jω). What is H(jω)?

13 Use the time-shift property to calculate the Fourier transform of the double pulse
   defined by

$$f(t) = \begin{cases} 1 & (\ldots) \\ 0 & (\text{otherwise}) \end{cases}$$

14 Calculate the Fourier transform of the windowed cosine function

$$f(t) = \cos \omega_0 t\, [H(t + \tfrac{1}{2}T) - H(t - \tfrac{1}{2}T)]$$

15 Find the Fourier transform of the shifted form of the windowed cosine function

$$g(t) = \cos \omega_0 t\, [H(t) - H(t - T)]$$

16 Calculate the Fourier transform of the windowed sine function

$$f(t) = \sin 2t\, [H(t + 1) - H(t - 1)]$$


8.4 The frequency response

In this section we first consider the relationship between the Fourier and Laplace transforms,
and then proceed to consider the frequency response in terms of the Fourier transform.


8.4.1 Relationship between Fourier and Laplace transforms


The differences between the Fourier and Laplace transforms are quite subtle. At first
glance it appears that to obtain the Fourier transform from the Laplace transform we
merely write jω for s, and that the difference ends there. This is true in some cases, but
not in all. Strictly, the Fourier and Laplace transforms are distinct, and neither is a
generalization of the other.

Writing down the defining integrals, we have

The Fourier transform

$$\mathcal{F}\{f(t)\} = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt \tag{8.31}$$

The bilateral Laplace transform

$$\mathcal{L}_B\{f(t)\} = \int_{-\infty}^{\infty} f(t)\, e^{-st}\, dt \tag{8.32}$$

The unilateral Laplace transform

$$\mathcal{L}\{f(t)\} = \int_{0}^{\infty} f(t)\, e^{-st}\, dt \tag{8.33}$$

There is an obvious structural similarity between (8.31) and (8.32), while the connection
with (8.33) is not so clear in view of the lower limit of integration. In the Laplace
transform definitions recall that s is a complex variable, and may be written as

$$s = \sigma + j\omega \tag{8.34}$$

where σ and ω are real variables. We can then interpret (8.31), the Fourier transform of
f(t), as a special case of (8.32), when σ = 0, provided that the Laplace transform exists
when σ = 0, or equivalently when s = jω (that is, s describes the imaginary axis in the
s plane). If we restrict our attention to causal functions, that is functions (or signals) that
are zero whenever t < 0, the bilateral Laplace transform (8.32) is identical with the
unilateral Laplace transform (8.33). The Fourier transform can thus be regarded as a
special case of the unilateral Laplace transform for causal functions, provided again that
the unilateral Laplace transform exists on the imaginary axis s = jω.

The next part of the story is concerned with a class of time signals f(t) whose
Laplace transforms do exist on the imaginary axis s = jω. Recall from (5.71) that a
causal linear time-invariant system with Laplace transfer function G(s) has an impulse
response h(t) given by

$$h(t) = \mathcal{L}^{-1}\{G(s)\} = g(t)H(t), \quad \text{say} \tag{8.35}$$

Furthermore, if the system is stable then all the poles of G(s) are in the left half-plane,
implying that g(t)H(t) → 0 as t → ∞. Let the pole locations of G(s) be

$$p_1, p_2, \ldots, p_n$$

where

$$p_k = -a_k^2 + jb_k$$

in which $a_k$, $b_k$ are real and $a_k \ne 0$ for k = 1, 2, ..., n. Examples of such poles are
illustrated in Figure 8.14, where we have assumed that G(s) is the transfer function of
a real system so that poles that do not lie on the real axis occur in conjugate pairs. As
indicated in Section 5.2.3, the Laplace transfer function G(s) will exist in the shaded
region of Figure 8.14 defined by

$$\mathrm{Re}(s) > -c^2$$

where $-c^2$ is the abscissa of convergence and is such that

$$0 < c^2 < \min_k a_k^2$$

Figure 8.14 Pole locations for G(s) and the region of existence of G(s).

The important conclusion is that for such systems G(s) always exists on the imaginary
axis s = jω, and so h(t) = g(t)H(t) always has a Fourier transform. In other words, we
have demonstrated that the impulse response function h(t) of a stable causal, linear
time-invariant system always has a Fourier transform. Moreover, we have shown that
this can be found by evaluating the Laplace transform on the imaginary axis; that is,
by putting s = jω in the Laplace transform. We have thus established that Fourier
transforms exist for a significant class of useful signals; this knowledge will be used
in Section 8.4.2.


Example 8.9 Which of the following causal time-invariant systems have impulse responses
that possess Fourier transforms? Find the latter when they exist.

$$\text{(a)} \quad \frac{d^2 y(t)}{dt^2} + 3\frac{dy(t)}{dt} + 2y(t) = u(t)$$

$$\text{(b)} \quad \frac{d^2 y(t)}{dt^2} + \omega^2 y(t) = u(t)$$

$$\text{(c)} \quad \frac{d^2 y(t)}{dt^2} + \frac{dy(t)}{dt} + y(t) = 2u(t) + \frac{du(t)}{dt}$$

Solution Assuming that the systems are initially in a quiescent state when t < 0, taking
Laplace transforms gives

$$\text{(a)} \quad Y(s) = \frac{1}{s^2 + 3s + 2}\, U(s) = G_a(s) U(s)$$

$$\text{(b)} \quad Y(s) = \frac{1}{s^2 + \omega^2}\, U(s) = G_b(s) U(s)$$

$$\text{(c)} \quad Y(s) = \frac{s + 2}{s^2 + s + 1}\, U(s) = G_c(s) U(s)$$

In case (a) the poles of $G_a(s)$ are at s = −1 and s = −2, so the system is stable and the
impulse response has a Fourier transform given by

$$G_a(j\omega) = \left. \frac{1}{s^2 + 3s + 2} \right|_{s = j\omega} = \frac{1}{2 - \omega^2 + j3\omega} = \frac{2 - \omega^2 - j3\omega}{(2 - \omega^2)^2 + 9\omega^2} = \frac{(2 - \omega^2) - j3\omega}{\omega^4 + 5\omega^2 + 4}$$

In case (b) we find that the poles of $G_b(s)$ are at s = jω and s = −jω; that is, on the
imaginary axis. The system is not stable (notice that the impulse response does not
decay to zero), and the impulse response does not possess a Fourier transform.

In case (c) the poles of $G_c(s)$ are at $s = -\tfrac{1}{2} + j\tfrac{1}{2}\sqrt{3}$ and
$s = -\tfrac{1}{2} - j\tfrac{1}{2}\sqrt{3}$. Since these are in the left half-plane, Re(s) < 0,
we conclude that the system is stable. The Fourier transform of the impulse response is then

$$G_c(j\omega) = \frac{2 + j\omega}{1 - \omega^2 + j\omega}$$
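The reasoning of Example 8.9 (locate the poles, check that they lie in the left half-plane,
then put s = jω) can be sketched in Python as follows. The function names, the
highest-power-first coefficient lists and the numerical value ω = 2 chosen for system (b)
are illustrative assumptions, and the pole test applies only to the second-order
denominators considered here.

```python
import cmath


def quadratic_poles(a2, a1, a0):
    """Roots of a2*s^2 + a1*s + a0 = 0, the poles of a second-order G(s)."""
    disc = cmath.sqrt(a1 * a1 - 4 * a2 * a0)
    return ((-a1 + disc) / (2 * a2), (-a1 - disc) / (2 * a2))


def is_stable(poles, tol=1e-12):
    """True when every pole lies strictly in the left half-plane."""
    return all(p.real < -tol for p in poles)


def transfer(num, den, s):
    """Evaluate G(s) = num(s)/den(s); coefficients are listed highest power first."""
    def horner(coeffs):
        acc = 0j
        for c in coeffs:
            acc = acc * s + c
        return acc
    return horner(num) / horner(den)


# Systems of Example 8.9; (b) is taken with omega = 2 purely for illustration.
SYSTEMS = {
    "a": ([1.0], [1.0, 3.0, 2.0]),
    "b": ([1.0], [1.0, 0.0, 4.0]),
    "c": ([1.0, 2.0], [1.0, 1.0, 1.0]),
}

if __name__ == "__main__":
    for name, (num, den) in SYSTEMS.items():
        poles = quadratic_poles(*den)
        if is_stable(poles):
            print(name, poles, "G(j1) =", transfer(num, den, 1j))
        else:
            print(name, poles, "poles not in the left half-plane")
```

Systems (a) and (c) pass the test, so their frequency transfer functions follow by
evaluating G(s) on the imaginary axis; system (b) fails it, in agreement with the solution.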


8.4.2 The frequency response


For a linear time-invariant system, initially in a quiescent state, having a Laplace transfer
function G(s), the response y(t) to an input u(t) is given in (5.66) as

$$Y(s) = G(s) U(s) \tag{8.36}$$

where Y(s) and U(s) are the Laplace transforms of y(t) and u(t) respectively. In
Section 5.8 we saw that, subject to the system being stable, the steady-state response
$y_{ss}(t)$ to a sinusoidal input u(t) = A sin ωt is given by (5.101) as

$$y_{ss}(t) = A\,|G(j\omega)|\, \sin[\omega t + \arg G(j\omega)] \tag{8.37}$$

That is, the steady-state response is also sinusoidal, with the same frequency as the
input signal but having an amplitude gain |G(jω)| and a phase shift arg G(jω).
More generally, we could have taken the input to be the complex sinusoidal signal

$$u(t) = A\, e^{j\omega t}$$

and, subject to the stability requirement, shown that the steady-state response is

$$y_{ss}(t) = A\, G(j\omega)\, e^{j\omega t} \tag{8.38}$$

or

$$y_{ss}(t) = A\,|G(j\omega)|\, e^{j[\omega t + \arg G(j\omega)]} \tag{8.39}$$

As before, |G(jω)| and arg G(jω) are called the amplitude gain and phase shift
respectively. Both are functions of the real frequency variable ω, and their plots versus
ω constitute the system frequency response, which, as we saw in Section 5.8, characterizes
the behaviour of the system. Note that taking imaginary parts throughout in (8.39)
leads to the sinusoidal response (8.37).

We note that the steady-state response (8.38) is simply the input signal $A e^{j\omega t}$
multiplied by the Fourier transform G(jω) of the system's impulse response. Consequently
G(jω) is called the frequency transfer function of the system. Therefore if the system
represented in (8.36) is stable, so that G(jω) exists as the Fourier transform of its
impulse response, and the input $u(t) = \mathcal{L}^{-1}\{U(s)\}$ has a Fourier transform U(jω), then
we may represent the system in terms of the frequency transfer function as

$$Y(j\omega) = G(j\omega) U(j\omega) \tag{8.40}$$

Equation (8.40) thus determines the Fourier transform of the system output, and can
be used to determine the frequency spectrum of the output from that of the input. This
means that both the amplitude and phase spectra of the output are available, since

$$|Y(j\omega)| = |G(j\omega)|\,|U(j\omega)| \tag{8.41a}$$

$$\arg Y(j\omega) = \arg G(j\omega) + \arg U(j\omega) \tag{8.41b}$$

We shall now consider an example that draws together both these and some earlier ideas
and serves to illustrate the relevance of this material in the communications industry.


Example 8.10 A signal f(t) consists of two components:

(a) a symmetric rectangular pulse of duration 2π (see Example 8.3) and
(b) a second pulse, also of duration 2π (that is, a copy of (a)), modulating a signal with
carrier frequency $\omega_c = 3$ (the process of modulation was introduced in Section 8.3.4).

Write down an expression for f(t) and illustrate its amplitude spectrum. Describe the
amplitude spectrum of the output signal if f(t) is applied to a stable causal system with
a Laplace transfer function

$$G(s) = \frac{1}{s^2 + \sqrt{2}\, s + 1}$$

Solution Denoting the pulse of Example 8.3, with T = π, by $P_\pi(t)$, and noting the use of
the term 'carrier signal' in Example 8.7, we have

$$f(t) = P_\pi(t) + (\cos 3t)\, P_\pi(t)$$

From Example 8.3,

$$\mathcal{F}\{P_\pi(t)\} = 2\pi\, \mathrm{sinc}\, \omega\pi$$

so, using the result of Example 8.7, we have

$$\mathcal{F}\{f(t)\} = F(j\omega) = 2\pi\, \mathrm{sinc}\, \omega\pi + \tfrac{1}{2}[2\pi\, \mathrm{sinc}(\omega - 3)\pi + 2\pi\, \mathrm{sinc}(\omega + 3)\pi]$$

The corresponding amplitude spectrum obtained by plotting |F(jω)| versus ω is
illustrated in Figure 8.15.

Figure 8.15 Amplitude spectrum of the signal $P_\pi(t) + (\cos 3t) P_\pi(t)$.

Since the system with transfer function

$$G(s) = \frac{1}{s^2 + \sqrt{2}\, s + 1}$$

is stable and causal, it has a frequency transfer function

$$G(j\omega) = \frac{1}{1 - \omega^2 + j\sqrt{2}\, \omega}$$

so that its amplitude gain is

$$|G(j\omega)| = \frac{1}{\sqrt{\omega^4 + 1}}$$

The amplitude spectrum of the output signal |Y(jω)| when the input is f(t) is then
obtained from (8.41a) as the product of |F(jω)| and |G(jω)|. Plots of both the amplitude
gain spectrum |G(jω)| and the output amplitude spectrum |Y(jω)| are shown in Figures
8.16(a) and (b) respectively. Note from Figure 8.16(b) that we have a reasonably good
copy of the amplitude spectrum of $P_\pi(t)$ (see Figure 8.9 with A = 1, T = π). However,
the second element of f(t) has effectively vanished. Our system has 'filtered out' this
latter component while 'passing' an almost intact version of the first. Examination of
the time-domain response would show that the first component does in fact experience
some 'smoothing', which, roughly speaking, consists of rounding of the sharp edges.
The system considered here is a second-order 'low-pass' Butterworth filter (introduced
in Section 6.10.1).
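The filtering effect described in Example 8.10 can be quantified with a few lines of Python.
The sketch below evaluates the input spectrum, the amplitude gain and their product
(8.41a) at chosen frequencies. The function names are illustrative assumptions, and the
pulse is taken to have unit height so that its transform is 2π sinc ωπ, as in the example.

```python
import math


def sinc(x):
    """sinc x = sin(x)/x with sinc 0 = 1."""
    return 1.0 if x == 0 else math.sin(x) / x


def input_spectrum(w):
    """F(jw) for f(t) = P(t) + cos(3t) P(t), pulse half-width pi and unit height."""
    pi = math.pi
    return 2 * pi * sinc(w * pi) + pi * (sinc((w - 3) * pi) + sinc((w + 3) * pi))


def butterworth_gain(w):
    """|G(jw)| for G(s) = 1/(s^2 + sqrt(2) s + 1), evaluated at s = jw."""
    return abs(1.0 / complex(1 - w * w, math.sqrt(2) * w))


def output_amplitude(w):
    """|Y(jw)| = |G(jw)| |F(jw)|, from (8.41a)."""
    return butterworth_gain(w) * abs(input_spectrum(w))


if __name__ == "__main__":
    for w in (0.0, 0.5, 3.0):
        print(w, abs(input_spectrum(w)), butterworth_gain(w), output_amplitude(w))
```

At ω = 0 the gain is 1, so the central lobe of the pulse spectrum passes unchanged, whereas
at the carrier frequency ω = 3 the gain is $1/\sqrt{82}$, roughly 0.11, which is why the
modulated component is strongly attenuated in Figure 8.16(b).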


Figure 8.16 (a) Amplitude gain spectrum of the system with $G(s) = 1/(s^2 + \sqrt{2}\, s + 1)$;
(b) amplitude spectrum of the output signal |Y(jω)| of Example 8.10.


8.4.3 Exercises

17 Find the impulse response of systems (a) and (c) of Example 8.9. Calculate the Fourier
   transform of each using the definition (8.15), and verify the results given in
   Example 8.9.

18 Use the time-shift property to calculate the Fourier transform of the double rectangular
   pulse f(t) illustrated in Figure 8.17.

   Figure 8.17 The double rectangular pulse of Exercise 18.

19 The system with transfer function

$$G(s) = \frac{1}{s^2 + \sqrt{2}\, s + 1}$$

   was discussed in Example 8.10. Make a transformation

$$s \to \frac{1}{s'}$$

   and write down G(s′). Examine the frequency response of a system with transfer function
   G(s′) and in particular find the amplitude response when ω = 0 and as ω → ∞. How
   would you describe such a system?

20 Use the symmetry property, and the result of Exercise 1, to calculate the Fourier
   transform of

$$f(t) = \frac{1}{a^2 + t^2}$$

   Sketch f(t) and its transform (which is real).

21 Using the results of Examples 8.3 and 8.7, calculate the Fourier transform of the
   pulse-modulated signal

$$f(t) = P_T(t) \cos \omega_0 t$$

   where

$$P_T(t) = \begin{cases} 1 & (|t| \le T) \\ 0 & (|t| > T) \end{cases}$$

   is the pulse of duration 2T.


8.5 Transforms of the step and impulse functions


In this section we consider the application of Fourier transforms to the concepts of
energy, power and convolution. In so doing, we shall introduce the Fourier transform
of the Heaviside unit step function H(t) and the impulse function δ(t).


8.5.1 Energy and power 


In Section 7.6.4 we introduced the concept of the power spectrum of a periodic signal
and found that it enabled us to deduce useful information relating to the latter. In this
section we define two quantities associated with time signals f(t), defined for −∞ < t < ∞,
namely signal energy and signal power. Not only are these important quantities in
themselves, but, as we shall see, they play an important role in characterizing signal types.
The total energy associated with the signal f(t) is defined as

$$E = \int_{-\infty}^{\infty} [f(t)]^2\, dt \tag{8.42}$$

If f(t) has a Fourier transform F(jω), so that, from (8.16),

$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(j\omega)\, e^{j\omega t}\, d\omega$$

then (8.42) may be expressed as

$$E = \int_{-\infty}^{\infty} f(t) f(t)\, dt = \int_{-\infty}^{\infty} f(t) \left[ \frac{1}{2\pi} \int_{-\infty}^{\infty} F(j\omega)\, e^{j\omega t}\, d\omega \right] dt$$

On changing the order of integration, this becomes

$$E = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(j\omega) \left[ \int_{-\infty}^{\infty} f(t)\, e^{j\omega t}\, dt \right] d\omega \tag{8.43}$$

From the defining integral (8.15) for F(jω), we recognize the part of the integrand
within the square brackets as F(−jω), which, if f(t) is real, is such that F(−jω) =
F*(jω), where F*(jω) is the complex conjugate of F(jω). Thus (8.43) becomes

$$E = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(j\omega) F^*(j\omega)\, d\omega$$

so that

$$E = \int_{-\infty}^{\infty} [f(t)]^2\, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} |F(j\omega)|^2\, d\omega \tag{8.44}$$

Equation (8.44) relates the total energy of the signal f(t) to the integral over all
frequencies of $|F(j\omega)|^2$. For this reason, $|F(j\omega)|^2$ is called the energy spectral
density, and a plot of $|F(j\omega)|^2$ versus ω is called the energy spectrum of the signal f(t).
The result (8.44) is called Parseval's theorem, and is an extension of the result contained
in Theorem 7.6 for periodic signals.


Example 8.11 Determine the energy spectral densities of

(a) the one-sided exponential function $f(t) = e^{-at} H(t)$ (a > 0),
(b) the rectangular pulse of Figure 8.6.

Solution (a) From (8.17), the Fourier transform of f(t) is

$$F(j\omega) = \frac{a - j\omega}{a^2 + \omega^2}$$

The energy spectral density of the function is therefore

$$|F(j\omega)|^2 = F(j\omega) F^*(j\omega) = \frac{a - j\omega}{a^2 + \omega^2} \cdot \frac{a + j\omega}{a^2 + \omega^2}$$

that is,

$$|F(j\omega)|^2 = \frac{1}{a^2 + \omega^2}$$

(b) From Example 8.3, the Fourier transform F(jω) of the rectangular pulse is

$$F(j\omega) = 2AT\, \mathrm{sinc}\, \omega T$$

Thus the energy spectral density of the pulse is

$$|F(j\omega)|^2 = 4A^2 T^2\, \mathrm{sinc}^2\, \omega T$$
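Parseval's theorem (8.44) can be checked numerically for the one-sided exponential of part
(a), for which both sides should equal 1/(2a). The Python sketch below evaluates the two
integrals by the midpoint rule. The truncation limits, step counts and function names are
illustrative assumptions, and the frequency integral converges slowly because the density
decays only like $1/\omega^2$.

```python
import math


def midpoint(func, lo, hi, steps):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))


def energy_time(a, t_max=40.0, steps=40000):
    """Integral of [e^{-at} H(t)]^2 over t, truncated at t_max."""
    def integrand(t):
        return math.exp(-2 * a * t)
    return midpoint(integrand, 0.0, t_max, steps)


def energy_frequency(a, w_max=2000.0, steps=400000):
    """(1/(2*pi)) times the integral of 1/(a^2 + w^2) over |w| <= w_max."""
    def density(w):
        return 1.0 / (a * a + w * w)
    return midpoint(density, -w_max, w_max, steps) / (2 * math.pi)


if __name__ == "__main__":
    a = 1.0
    print(energy_time(a), energy_frequency(a), 1 / (2 * a))
```

The small shortfall in the frequency-domain value is the energy lying outside the truncated
band |ω| ≤ 2000.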


There are important signals f(t), defined in general for −∞ < t < ∞, for which the
integral $\int_{-\infty}^{\infty} [f(t)]^2\, dt$ in (8.42) either is unbounded (that is, it becomes
infinite) or does not converge to a finite limit; for example, sin t. For such signals, instead
of considering energy, we consider the average power P, frequently referred to as the power
of the signal. This is defined by

$$P = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} [f(t)]^2\, dt \tag{8.45}$$

Note that for signals that satisfy the Dirichlet conditions (Theorem 8.1) the integral
in (8.42) exists and, since in (8.45) we divide by the signal duration, it follows that
such signals have zero power associated with them.
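The distinction drawn here can be seen numerically by evaluating the average in (8.45) over
a long but finite window. The Python sketch below does this for sin t, the example mentioned
above, and for the decaying signal $e^{-|t|}$. The second signal, the window length T = 1000
and the function name are illustrative assumptions.

```python
import math


def average_power(f, T, steps=200000):
    """(1/T) times the integral of [f(t)]^2 over -T/2 <= t <= T/2 (midpoint rule)."""
    h = T / steps
    total = 0.0
    for k in range(steps):
        t = -T / 2 + (k + 0.5) * h
        total += f(t) ** 2
    return total * h / T


def decaying(t):
    """A finite-energy signal, e^{-|t|}, used only for comparison."""
    return math.exp(-abs(t))


if __name__ == "__main__":
    print(average_power(math.sin, 1000.0))  # stays near 1/2 as the window grows
    print(average_power(decaying, 1000.0))  # shrinks like 1/T as the window grows
```

For sin t the windowed average settles near 1/2, while for the finite-energy signal it
equals the (fixed) energy divided by T and so tends to zero.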

We now pose the question: ‘Are there other signals which possess Fourier transforms?’ 
As you may expect, the answer is ‘Yes’, although the manner of obtaining the transforms 
will be different from our procedure so far. We shall see that the transforms so obtained, 
on using the inversion integral (8.16), yield some very ‘ordinary’ signals so far excluded 
from our discussion. 

We begin by considering the Fourier transform of the generalized function δ(t), the
Dirac delta function introduced in Section 5.5.8. Recall from (5.49) that δ(t) satisfies
the sifting property; that is, for a continuous function g(t),

$$\int_a^b g(t)\, \delta(t - c)\, dt = \begin{cases} g(c) & (a < c < b) \\ 0 & \text{otherwise} \end{cases}$$

Using the defining integral (8.15), we readily obtain the following two Fourier
transforms:

$$\mathcal{F}\{\delta(t)\} = \int_{-\infty}^{\infty} \delta(t)\, e^{-j\omega t}\, dt = 1 \tag{8.46}$$

$$\mathcal{F}\{\delta(t - t_0)\} = \int_{-\infty}^{\infty} \delta(t - t_0)\, e^{-j\omega t}\, dt = e^{-j\omega t_0} \tag{8.47}$$

These two transforms are, by now, unremarkable, and, noting that $|e^{-j\omega t_0}| = 1$, we
illustrate the signals and their spectra in Figure 8.18.

Figure 8.18 (a) δ(t) and its amplitude spectrum; (b) $\delta(t - t_0)$ and its amplitude spectrum.

These results may be confirmed in MATLAB. Using the commands

    syms w t
    D=sym('Dirac(t)');
    F=fourier(D,t,w)

returns

    F=1

in agreement with (8.46); whilst the commands

    syms w t T
    D1=sym('Dirac(t-T)');
    F1=fourier(D1,t,w)

return

    F1=exp(-i*T*w)

which confirms (8.47), with T replacing $t_0$.
Likewise in MAPLE the commands

    with(inttrans):
    fourier(Dirac(t),t,w);

return the answer 1.


We now depart from the definition of the Fourier transform given in (8.15) and seek
new transform pairs based on (8.46) and (8.47). Using the symmetry (duality) property
of Section 8.3.5, we deduce from (8.46) that

$$1 \quad \text{and} \quad 2\pi\delta(-\omega) = 2\pi\delta(\omega) \tag{8.48}$$

is another Fourier transform pair. Likewise, from (8.47), we deduce that

$$e^{-jt_0 t} \quad \text{and} \quad 2\pi\delta(-\omega - t_0)$$

is also a Fourier transform pair. Substituting $t_0 = -\omega_0$ into the latter, we have

$$e^{j\omega_0 t} \quad \text{and} \quad 2\pi\delta(\omega_0 - \omega) = 2\pi\delta(\omega - \omega_0) \tag{8.49}$$

as another Fourier transform pair.

We are thus claiming in (8.48) and (8.49) that $f_1(t) = 1$ and $f_2(t) = e^{j\omega_0 t}$, which
do not have 'ordinary' Fourier transforms as defined by (8.15), actually do have
'generalized' Fourier transforms given by

$$F_1(j\omega) = 2\pi\delta(\omega) \tag{8.50}$$

$$F_2(j\omega) = 2\pi\delta(\omega - \omega_0) \tag{8.51}$$

respectively.

The term 'generalized' has been used because the two transforms contain the generalized
functions δ(ω) and $\delta(\omega - \omega_0)$. Let us now test our conjecture that (8.50) and
(8.51) are Fourier transforms of $f_1(t)$ and $f_2(t)$ respectively. If (8.50) and (8.51) really
are Fourier transforms then their time-domain images $f_1(t)$ and $f_2(t)$ respectively should
reappear via the inverse transform (8.16). Substituting $F_1(j\omega)$ from (8.50) into (8.16),
we have

$$\mathcal{F}^{-1}\{F_1(j\omega)\} = \frac{1}{2\pi} \int_{-\infty}^{\infty} F_1(j\omega)\, e^{j\omega t}\, d\omega = \frac{1}{2\pi} \int_{-\infty}^{\infty} 2\pi\delta(\omega)\, e^{j\omega t}\, d\omega = 1$$

so $f_1(t) = 1$ is recovered.
Similarly, using (8.51), we have

$$\mathcal{F}^{-1}\{F_2(j\omega)\} = \frac{1}{2\pi} \int_{-\infty}^{\infty} 2\pi\delta(\omega - \omega_0)\, e^{j\omega t}\, d\omega = e^{j\omega_0 t}$$

so that $f_2(t) = e^{j\omega_0 t}$ is also recovered.

Our approach has therefore been successful, and we do indeed have a way of generating
new pairs of transforms. We shall therefore use the approach to find generalized
Fourier transforms for the signals

$$f_3(t) = \cos \omega_0 t, \qquad f_4(t) = \sin \omega_0 t$$

Since

$$f_3(t) = \cos \omega_0 t = \tfrac{1}{2}(e^{j\omega_0 t} + e^{-j\omega_0 t})$$

the linearity property (8.22) gives

$$\mathcal{F}\{f_3(t)\} = \tfrac{1}{2}\mathcal{F}\{e^{j\omega_0 t}\} + \tfrac{1}{2}\mathcal{F}\{e^{-j\omega_0 t}\}$$

which, on using (8.49), leads to the generalized Fourier transform pair

$$\mathcal{F}\{\cos \omega_0 t\} = \pi[\delta(\omega - \omega_0) + \delta(\omega + \omega_0)] \tag{8.52}$$

Likewise, we deduce the generalized Fourier transform pair

$$\mathcal{F}\{\sin \omega_0 t\} = j\pi[\delta(\omega + \omega_0) - \delta(\omega - \omega_0)] \tag{8.53}$$

The development of (8.53) and the verification that both (8.52) and (8.53) invert
correctly using the inverse transform (8.16) is left as an exercise for the reader.

It is worth noting at this stage that defining the Fourier transform $\mathcal{F}\{f(t)\}$ of f(t)
in (8.15) as

$$\mathcal{F}\{f(t)\} = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt$$

whenever the integral exists does not preclude the existence of other Fourier transforms,
such as the generalized one just introduced, defined by other means.
It is clear that the total energy

$$E = \int_{-\infty}^{\infty} \cos^2 \omega_0 t\, dt$$

associated with the signal $f_3(t) = \cos \omega_0 t$ is unbounded. However, from (8.45), we can
calculate the power associated with the signal as

$$P = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} \cos^2 \omega_0 t\, dt = \lim_{T \to \infty} \frac{1}{T} \left[ \tfrac{1}{2} t + \frac{1}{4\omega_0} \sin 2\omega_0 t \right]_{-T/2}^{T/2} = \tfrac{1}{2}$$

Thus, while the signal $f_3(t) = \cos \omega_0 t$ has unbounded energy associated with it, its
power content is ½. Signals whose associated energy is finite, for example $f(t) = e^{-at} H(t)$
(a > 0), are sometimes called energy signals, while those whose associated energy is
unbounded but whose total power is finite are known as power signals. The concepts
of power signals and power spectral density are important in the analysis of random
signals, and the interested reader should consult specialized texts.

Example 8.12 Suppose that a periodic function f(t), defined on −∞ < t < ∞, may be
expanded in a Fourier series having exponential form

$$f(t) = \sum_{n=-\infty}^{\infty} F_n\, e^{jn\omega_0 t}$$

What is the (generalized) Fourier transform of f(t)?

Solution From the definition,

$$\mathcal{F}\{f(t)\} = \mathcal{F}\left\{ \sum_{n=-\infty}^{\infty} F_n\, e^{jn\omega_0 t} \right\} = \sum_{n=-\infty}^{\infty} F_n\, \mathcal{F}\{e^{jn\omega_0 t}\}$$

which, on using (8.49), gives

$$\mathcal{F}\{f(t)\} = \sum_{n=-\infty}^{\infty} F_n\, 2\pi\delta(\omega - n\omega_0)$$

That is,

$$\mathcal{F}\{f(t)\} = 2\pi \sum_{n=-\infty}^{\infty} F_n\, \delta(\omega - n\omega_0)$$

where $F_n$ (−∞ < n < ∞) are the coefficients of the exponential form of the Fourier series
representation of f(t).

Example 8.13 Use the result of Example 8.12 to verify the Fourier transform of
$f(t) = \cos \omega_0 t$ given in (8.52).

Solution Since

$$f(t) = \cos \omega_0 t = \tfrac{1}{2} e^{j\omega_0 t} + \tfrac{1}{2} e^{-j\omega_0 t}$$

the $F_n$ of Example 8.12 are

$$F_{-1} = F_1 = \tfrac{1}{2}, \qquad F_n = 0 \quad (n \ne \pm 1)$$

Thus, using the result

$$\mathcal{F}\{f(t)\} = 2\pi \sum_{n=-\infty}^{\infty} F_n\, \delta(\omega - n\omega_0)$$

we have

$$\mathcal{F}\{\cos \omega_0 t\} = 2\pi F_{-1}\, \delta(\omega + \omega_0) + 2\pi F_1\, \delta(\omega - \omega_0) = \pi[\delta(\omega + \omega_0) + \delta(\omega - \omega_0)]$$

in agreement with (8.52).

Confirm this answer using the MATLAB commands

    syms w t a
    F=fourier(cos(a*t),t,w)

where a has been used to represent $\omega_0$.
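The recipe of Example 8.12 (compute the exponential Fourier coefficients $F_n$, then attach
the weight $2\pi F_n$ to an impulse at $\omega = n\omega_0$) can also be sketched in Python.
The code below computes the coefficients of $\cos \omega_0 t$ by the midpoint rule over one
period and returns the impulse weights. The function names and the value chosen for
$\omega_0$ are illustrative assumptions.

```python
import cmath
import math


def exp_fourier_coeff(f, n, w0, steps=4000):
    """F_n = (1/T) * integral over one period T = 2*pi/w0 of f(t) e^{-j n w0 t} dt."""
    period = 2 * math.pi / w0
    h = period / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * h
        total += f(t) * cmath.exp(-1j * n * w0 * t)
    return total * h / period


def impulse_weights(f, w0, n_max):
    """Map n -> 2*pi*F_n, the weight of delta(w - n*w0) in the generalized transform."""
    return {n: 2 * math.pi * exp_fourier_coeff(f, n, w0)
            for n in range(-n_max, n_max + 1)}


W0 = 3.0  # illustrative fundamental frequency


def cosine(t):
    """The signal of Example 8.13."""
    return math.cos(W0 * t)


if __name__ == "__main__":
    for n, weight in sorted(impulse_weights(cosine, W0, 2).items()):
        print(n, round(weight.real, 6), round(weight.imag, 6))
```

Only the entries n = ±1 are non-zero, each with weight π, which is the content of (8.52).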


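Example 8.14 Determine the (generalized) Fourier transform of the periodic 'sawtooth'
function, defined by

$$f(t) = \frac{2t}{T} \quad (0 < t < 2T), \qquad f(t + 2T) = f(t)$$

Solution In Example 7.19 we saw that the exponential form of the Fourier series
representation of f(t) is

$$f(t) = \sum_{n=-\infty}^{\infty} F_n\, e^{jn\omega_0 t}$$

with

$$\omega_0 = \frac{2\pi}{2T} = \frac{\pi}{T}, \qquad F_0 = 2, \qquad F_n = \frac{j2}{n\pi} \quad (n \ne 0)$$

It follows from Example 8.12 that the Fourier transform $\mathcal{F}\{f(t)\}$ is

$$\mathcal{F}\{f(t)\} = F(j\omega) = 4\pi\delta(\omega) + \sum_{\substack{n=-\infty \\ n \ne 0}}^{\infty} \frac{j4}{n}\, \delta(\omega - n\omega_0) = 4\pi\delta(\omega) + j4 \sum_{\substack{n=-\infty \\ n \ne 0}}^{\infty} \frac{1}{n}\, \delta\!\left( \omega - \frac{n\pi}{T} \right)$$

Thus we see that the amplitude spectrum simply consists of pulses located at integer
multiples of the fundamental frequency $\omega_0 = \pi/T$. The discrete line spectrum obtained via
the exponential form of the Fourier series for this periodic function is thus reproduced,
now with a scaling factor of 2π.

Example 8.15 Determine the (generalized) Fourier transform of the unit impulse train
$f(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT)$ shown symbolically in Figure 8.19.

Figure 8.19 Unit impulse train $f(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT)$.

Solution Although f(t) is a generalized function, and not a function in the ordinary sense,
it follows that since

$$\begin{aligned}
f(t + kT) &= \sum_{n=-\infty}^{\infty} \delta(t + (k - n)T) \quad (k \text{ an integer}) \\
&= \sum_{m=-\infty}^{\infty} \delta(t - mT) \quad (m = n - k) \\
&= f(t)
\end{aligned}$$

it is periodic, with period T. Moreover, we can formally expand f(t) as a Fourier series

$$f(t) = \sum_{n=-\infty}^{\infty} F_n\, e^{jn\omega_0 t} \quad \left( \omega_0 = \frac{2\pi}{T} \right)$$

with

$$F_n = \frac{1}{T} \int_{-T/2}^{T/2} f(t)\, e^{-jn\omega_0 t}\, dt = \frac{1}{T} \int_{-T/2}^{T/2} \delta(t)\, e^{-jn\omega_0 t}\, dt = \frac{1}{T} \quad \text{for all } n$$

It follows from Example 8.12 that

$$\mathcal{F}\{f(t)\} = 2\pi \sum_{n=-\infty}^{\infty} \frac{1}{T}\, \delta(\omega - n\omega_0) = \omega_0 \sum_{n=-\infty}^{\infty} \delta(\omega - n\omega_0)$$

Thus we have shown that

$$\mathcal{F}\left\{ \sum_{n=-\infty}^{\infty} \delta(t - nT) \right\} = \omega_0 \sum_{n=-\infty}^{\infty} \delta(\omega - n\omega_0) \tag{8.54}$$

where $\omega_0 = 2\pi/T$. That is, the time-domain impulse train has another impulse train as
its transform. We shall see in Section 8.6.4 that this result is of particular importance in
dealing with sampled time signals.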
Following our successful hunt for generalized Fourier transforms, we are led to consider
the possibility that the Heaviside unit step function H(t) defined in Section 5.5.1
may have a transform in this sense. Recall from (5.56) that if

$$f(t) = H(t)$$

then

$$\frac{df(t)}{dt} = \delta(t)$$

From the time-differentiation property (8.23), we might expect that if

$$\mathcal{F}\{H(t)\} = \bar{H}(j\omega)$$

then

$$(j\omega)\bar{H}(j\omega) = \mathcal{F}\{\delta(t)\} = 1 \tag{8.55}$$

Equation (8.55) suggests that a candidate for $\bar{H}(j\omega)$ might be 1/jω, but this is not the
case, since inversion using (8.16) does not give H(t) back. Using (8.16) and complex
variable techniques, it can be shown that

$$\mathcal{F}^{-1}\left\{ \frac{1}{j\omega} \right\} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{j\omega t}}{j\omega}\, d\omega = \begin{cases} \tfrac{1}{2} & (t > 0) \\ 0 & (t = 0) \\ -\tfrac{1}{2} & (t < 0) \end{cases} = \tfrac{1}{2}\, \mathrm{sgn}(t)$$

where sgn(t) is the signum function, defined by

$$\mathrm{sgn}(t) = \begin{cases} 1 & (t > 0) \\ 0 & (t = 0) \\ -1 & (t < 0) \end{cases}$$

(Note: This last result may be obtained in terms of Heaviside functions using the
MATLAB commands

    syms w t
    f=ifourier(1/(i*w))

or using the MAPLE commands

    with(inttrans):
    invfourier(1/(I*w),w,t);

)

However, we note that (8.55) is also satisfied by

$$\bar{H}(j\omega) = \frac{1}{j\omega} + c\,\delta(\omega) \tag{8.56}$$

where c is a constant. This follows from the equivalence property (see Definition 5.2,
Section 5.5.11) f(ω)δ(ω) = f(0)δ(ω) with f(ω) = jω, which gives

$$(j\omega)\bar{H}(j\omega) = 1 + (j\omega)c\,\delta(\omega) = 1$$

Inverting (8.56) using (8.16), we have

$$g(t) = \mathcal{F}^{-1}\left\{ \frac{1}{j\omega} + c\,\delta(\omega) \right\} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left[ \frac{1}{j\omega} + c\,\delta(\omega) \right] e^{j\omega t}\, d\omega = \begin{cases} c/2\pi + \tfrac{1}{2} & (t > 0) \\ c/2\pi & (t = 0) \\ c/2\pi - \tfrac{1}{2} & (t < 0) \end{cases}$$

and, choosing c = π, we have

$$g(t) = \begin{cases} 1 & (t > 0) \\ \tfrac{1}{2} & (t = 0) \\ 0 & (t < 0) \end{cases}$$

Thus we have (almost) recovered the step function H(t). Here g(t) takes the value ½ at
t = 0, but this is not surprising in view of the convergence of the Fourier integral at
points of discontinuity as given in Theorem 8.1. With this proviso, we have shown that

$$\bar{H}(j\omega) = \mathcal{F}\{H(t)\} = \frac{1}{j\omega} + \pi\delta(\omega) \tag{8.57}$$

We must confess to having made an informed guess as to what additional term to add in
(8.56) to produce the Fourier transform (8.57). We could instead have chosen cδ(kω)
with k a constant as an additional term. While it is possible to show that this would not lead
to a different result, proving uniqueness is not trivial and is beyond the scope of this book.


Using the MATLAB commands

    syms w t
    H=sym('Heaviside(t)');
    F=fourier(H,t,w)

returns

    F=pi*Dirac(w)-i/w

which, noting that −i = 1/i, confirms result (8.57).
The same result is obtained in MAPLE using the commands

    with(inttrans):
    fourier(Heaviside(t),t,w);

Likewise the MATLAB commands

    syms w t T
    H=sym('Heaviside(t-T)');
    F=fourier(H,t,w)

return

    F=exp(-i*T*w)*(pi*Dirac(w)-i/w)

which gives us another Fourier transform

$$\mathcal{F}\{H(t - T)\} = e^{-j\omega T}\left( \pi\delta(\omega) + \frac{1}{j\omega} \right)$$


8.5.2 Convolution


In Section 5.6.6 we saw that the convolution integral, in conjunction with the Laplace 
transform, provided a useful tool for discussing the nature of the solution of a differ- 
ential equation, although it was not perhaps the most efficient way of evaluating the 
solution to a particular problem. As the reader may now have come to expect, in view 
of the duality between time and frequency domains, there are two convolution results 
involving the Fourier transform. 


Convolution in time 


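Suppose that

\[ \mathcal{F}\{u(t)\} = U(j\omega) = \int_{-\infty}^{\infty} u(t)\, e^{-j\omega t}\, dt \]

\[ \mathcal{F}\{v(t)\} = V(j\omega) = \int_{-\infty}^{\infty} v(t)\, e^{-j\omega t}\, dt \]

then the Fourier transform of the convolution

\[ y(t) = \int_{-\infty}^{\infty} u(\tau)v(t - \tau)\, d\tau = u(t) * v(t) \tag{8.58} \]

is

\[ \mathcal{F}\{y(t)\} = Y(j\omega) = \int_{-\infty}^{\infty} e^{-j\omega t}\left[\int_{-\infty}^{\infty} u(\tau)v(t - \tau)\, d\tau\right] dt
   = \int_{-\infty}^{\infty} u(\tau)\left[\int_{-\infty}^{\infty} e^{-j\omega t} v(t - \tau)\, dt\right] d\tau \]

Introducing the change of variables $z = t - \tau$, $\tau = \tau$ and following the procedure for
change of variable from Section 5.6.6, the transform can be expressed as

\[ Y(j\omega) = \int_{-\infty}^{\infty} u(\tau)\left[\int_{-\infty}^{\infty} v(z)\, e^{-j\omega(z + \tau)}\, dz\right] d\tau
   = \int_{-\infty}^{\infty} u(\tau)\, e^{-j\omega\tau}\, d\tau \int_{-\infty}^{\infty} v(z)\, e^{-j\omega z}\, dz \]

so that

\[ Y(j\omega) = U(j\omega)V(j\omega) \tag{8.59} \]

That is,

\[ \mathcal{F}\{u(t) * v(t)\} = \mathcal{F}\{v(t) * u(t)\} = U(j\omega)V(j\omega) \tag{8.60} \]

indicating that a convolution in the time domain is transformed into a product in the
frequency domain.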
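The result (8.60) can be checked numerically. The sketch below is only an illustration under stated assumptions: the two causal signals $u(t) = e^{-t}H(t)$ and $v(t) = e^{-2t}H(t)$ are chosen here for convenience (they are not taken from the text), their convolution $e^{-t} - e^{-2t}$ is worked out by hand, and the helper `fourier_numeric` is a hypothetical name for a simple midpoint-rule approximation of the Fourier integral.

```python
import cmath
import math


def fourier_numeric(f, omega, t_max=30.0, n=60_000):
    """Midpoint-rule approximation of the integral of f(t) e^{-j omega t} over [0, t_max].

    Only suitable for causal signals that have decayed to (almost) zero by t_max.
    """
    dt = t_max / n
    total = 0j
    for i in range(n):
        t = (i + 0.5) * dt
        total += f(t) * cmath.exp(-1j * omega * t)
    return total * dt


# Illustrative signals (an assumption of this sketch, not part of the text).
def u(t):
    return math.exp(-t)            # u(t) = e^{-t} H(t)


def v(t):
    return math.exp(-2.0 * t)      # v(t) = e^{-2t} H(t)


def y(t):
    # Closed form of (u * v)(t) for t >= 0, obtained by direct integration.
    return math.exp(-t) - math.exp(-2.0 * t)


omega = 1.5
U = fourier_numeric(u, omega)
V = fourier_numeric(v, omega)
Y = fourier_numeric(y, omega)

print("Y(jw)       =", Y)
print("U(jw)V(jw)  =", U * V)
print("difference  =", abs(Y - U * V))
```

The two printed transforms should agree to within the accuracy of the numerical integration, and each of $U$ and $V$ can also be compared with the closed forms $1/(1 + j\omega)$ and $1/(2 + j\omega)$.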
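Convolution in frequency

If

\[ \mathcal{F}\{u(t)\} = U(j\omega), \quad \text{with} \quad u(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} U(j\omega)\, e^{j\omega t}\, d\omega \]

\[ \mathcal{F}\{v(t)\} = V(j\omega), \quad \text{with} \quad v(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} V(j\omega)\, e^{j\omega t}\, d\omega \]

then the inverse transform of the convolution

\[ U(j\omega) * V(j\omega) = \int_{-\infty}^{\infty} U(jy)\, V(j(\omega - y))\, dy \]

is given by

\[ \mathcal{F}^{-1}\{U(j\omega) * V(j\omega)\}
   = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{j\omega t}\left[\int_{-\infty}^{\infty} U(jy)\, V(j(\omega - y))\, dy\right] d\omega
   = \frac{1}{2\pi}\int_{-\infty}^{\infty} U(jy)\left[\int_{-\infty}^{\infty} V(j(\omega - y))\, e^{j\omega t}\, d\omega\right] dy \]

A change of variable $z = \omega - y$, $\omega = \omega$ leads to

\[ \mathcal{F}^{-1}\{U(j\omega) * V(j\omega)\}
   = \frac{1}{2\pi}\int_{-\infty}^{\infty} U(jy)\left[\int_{-\infty}^{\infty} V(jz)\, e^{j(z + y)t}\, dz\right] dy
   = \frac{1}{2\pi}\int_{-\infty}^{\infty} U(jy)\, e^{jyt}\, dy \int_{-\infty}^{\infty} V(jz)\, e^{jzt}\, dz
   = 2\pi\, u(t)v(t) \]

That is,

\[ \mathcal{F}\{u(t)v(t)\} = \frac{1}{2\pi}\, U(j\omega) * V(j\omega) \tag{8.61} \]

and thus multiplication in the time domain corresponds to convolution in the frequency
domain (subject to the scaling factor $1/(2\pi)$).


Example 8.16 Suppose that $f(t)$ has a Fourier transform $F(j\omega)$. Find an expression for the Fourier
transform of $g(t)$, where

\[ g(t) = \int_{-\infty}^{t} f(\tau)\, d\tau \]

Solution Since

\[ H(t - \tau) = \begin{cases} 1 & (\tau \le t) \\ 0 & (\tau > t) \end{cases} \]

we can write

\[ g(t) = \int_{-\infty}^{\infty} f(\tau)H(t - \tau)\, d\tau = f(t) * H(t) \]

the convolution of $f(t)$ and $H(t)$. Then, using (8.60),

\[ \mathcal{F}\{g(t)\} = G(j\omega) = F(j\omega)\bar{H}(j\omega) \]

which, on using the expression for $\bar{H}(j\omega)$ from (8.57), gives

\[ G(j\omega) = \frac{F(j\omega)}{j\omega} + \pi F(j\omega)\delta(\omega) \]

so that

\[ G(j\omega) = \frac{F(j\omega)}{j\omega} + \pi F(0)\delta(\omega) \]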

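8.5.3 Exercises


22 Verify that

\[ \mathcal{F}^{-1}\{\pi[\delta(\omega - \omega_0) + \delta(\omega + \omega_0)]\} = \cos\omega_0 t \]

23 Show that $\mathcal{F}\{\sin\omega_0 t\} = j\pi[\delta(\omega + \omega_0) - \delta(\omega - \omega_0)]$.
Use (8.16) to verify that

\[ \mathcal{F}^{-1}\{j\pi[\delta(\omega + \omega_0) - \delta(\omega - \omega_0)]\} = \sin\omega_0 t \]

24 Suppose that $f(t)$ and $g(t)$ have Fourier transforms $F(j\omega)$ and $G(j\omega)$ respectively, defined in the
'ordinary' sense (that is, using (8.15)), and show that

\[ \int_{-\infty}^{\infty} f(t)G(jt)\, dt = \int_{-\infty}^{\infty} F(jt)g(t)\, dt \]

This result is known as Parseval's formula.

25 Use the results of Exercise 24 and the symmetry property to show that

\[ \int_{-\infty}^{\infty} f(t)g(t)\, dt = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(j\omega)G(-j\omega)\, d\omega \tag{8.62} \]

26 Use the convolution result in the frequency domain to obtain $\mathcal{F}\{H(t)\sin\omega_0 t\}$.

27 Calculate the exponential form of the Fourier series for the periodic pulse train shown in
Figure 8.20. Hence show that

\[ \mathcal{F}\{f(t)\} = \frac{2\pi A d}{T}\sum_{n=-\infty}^{\infty} \operatorname{sinc}\left(\frac{n\pi d}{T}\right)\delta(\omega - n\omega_0) \]

($\omega_0 = 2\pi/T$), and $A$ is the height of the pulse.

Figure 8.20 Periodic pulse train of Exercise 27.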
8.6 The Fourier transform in discrete time


8.6.1 Introduction


The earlier sections of this chapter have discussed the Fourier transform of signals
defined as functions of the continuous-time variable t. We have seen that a major area
of application is in the analysis of signals in the frequency domain, leading to the
concept of the frequency response of a linear system. In Chapter 6 we considered signals
defined at discrete-time instants, together with linear systems modelled by difference
equations. There we found that in system analysis the z transform plays a role similar
to that of the Laplace transform for continuous-time systems. We now attempt to
develop a theory of Fourier analysis to complement that for continuous-time systems,
and then consider the problem of estimating the continuous-time Fourier transform in
a form suitable for computer execution.


8.6.2 A Fourier transform for sequences


First we return to our work on Fourier series and write down the exponential form of
the Fourier series representation for the periodic function $F(e^{j\theta})$ of period $2\pi$. Writing
$\theta = \omega t$, we infer from (7.57) and (7.61) that

\[ F(e^{j\theta}) = \sum_{n=-\infty}^{\infty} f_n e^{jn\theta} \tag{8.63} \]

where

\[ f_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} F(e^{j\theta})\, e^{-jn\theta}\, d\theta \tag{8.64} \]

Thus the operation has generated a sequence of numbers $\{f_n\}$ from the periodic function
$F(e^{j\theta})$ of the continuous variable $\theta$. Let us reverse the process and imagine that we
start with a sequence $\{g_k\}$ and use (8.63) to define a periodic function $\tilde{G}(e^{j\theta})$ such that

\[ \tilde{G}(e^{j\theta}) = \sum_{n=-\infty}^{\infty} g_n e^{jn\theta} \tag{8.65} \]

We have thus defined a transformation from the sequence $\{g_k\}$ to $\tilde{G}(e^{j\theta})$. This
transformation can be inverted, since, from (8.64),

\[ g_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} \tilde{G}(e^{j\theta})\, e^{-jk\theta}\, d\theta \tag{8.66} \]

and we recover the terms of the sequence $\{g_k\}$ from $\tilde{G}(e^{j\theta})$.
It is convenient for our later work if we modify the definition slightly, defining the
Fourier transform of the sequence $\{g_k\}$ as

\[ \mathcal{F}\{g_k\} = G(e^{j\theta}) = \sum_{n=-\infty}^{\infty} g_n e^{-jn\theta} \tag{8.67} \]

whenever the series converges. The inverse transform is then given, from (8.66), by

\[ g_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} G(e^{j\theta})\, e^{jk\theta}\, d\theta \tag{8.68} \]

The results (8.67) and (8.68) thus constitute the Fourier transform pair for the sequence
$\{g_k\}$. Note that $G(e^{j\theta})$ is a function of the continuous variable $\theta$, and since it is a
function of $e^{j\theta}$ it is periodic (with a period of at most $2\pi$), irrespective of whether or not the
sequence $\{g_k\}$ is periodic.

Note that we have adopted the notation $G(e^{j\theta})$ rather than $G(\theta)$ for the Fourier transform,
similar to our use of $F(j\omega)$ rather than $F(\omega)$ in the case of continuous-time signals. In
the present case we shall be concerned with the relationship with the z transform of
Chapter 6, where $z = r e^{j\theta}$, and the significance of our choice will soon emerge.


Example 8.17 Find the transform of the sequence $\{g_k\}_{-\infty}^{\infty}$, where $g_0 = 2$, $g_2 = g_{-2} = 1$ and
$g_k = g_{-k} = 0$ for $k \ne 0$ or $2$.

Solution From the definition (8.67),

\[ \mathcal{F}\{g_k\} = G(e^{j\theta}) = \sum_{n=-\infty}^{\infty} g_n e^{-jn\theta}
   = g_{-2} e^{j2\theta} + g_0 + g_2 e^{-j2\theta} = e^{j2\theta} + 2 + e^{-j2\theta}
   = 2(1 + \cos 2\theta) = 4\cos^2\theta \]

In this particular case the transform is periodic of period $\pi$, rather than $2\pi$. This is
because $g_1 = g_{-1} = 0$, so that $\cos\theta$ does not appear in the transform. Since $G(e^{j\theta})$ is
purely real, we may plot the transform as in Figure 8.21.

Figure 8.21 Transform of the sequence of Example 8.17, $G(e^{j\theta}) = 4\cos^2\theta$.
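The transform of Example 8.17 can be reproduced with a few lines of code. This is a minimal sketch: the function name `sequence_transform` and the dictionary representation of the sequence are choices made here for illustration, and the sum in (8.67) is simply evaluated over the non-zero terms.

```python
import cmath
import math


def sequence_transform(g, theta):
    """Evaluate G(e^{j theta}) = sum_n g_n e^{-j n theta}, as in (8.67).

    g maps each index n to the non-zero term g_n; all other terms are taken as zero.
    """
    return sum(gn * cmath.exp(-1j * n * theta) for n, gn in g.items())


# Sequence of Example 8.17: g_0 = 2, g_2 = g_{-2} = 1, all other terms zero.
g = {-2: 1.0, 0: 2.0, 2: 1.0}

for k in range(0, 9):
    theta = k * math.pi / 8
    G = sequence_transform(g, theta)
    # Real part should match 4 cos^2(theta); the imaginary part should be (numerically) zero.
    print(f"theta = {theta:6.4f}  G = {G.real:8.5f}  4cos^2 = {4 * math.cos(theta) ** 2:8.5f}")
```

Evaluating the function at $\theta$ and at $\theta + \pi$ gives the same value, which is the period-$\pi$ behaviour noted in the example.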








Having defined a Fourier transform for sequences, we now wish to link it to 
the frequency response of discrete-time systems. In Section 8.4.2 the link between 
frequency responses and the Fourier transforms of continuous-time systems was estab- 
lished using the Laplace transform. We suspect therefore that the z transform should 
yield the necessary link for discrete-time systems. Indeed, the argument follows closely 
that of Section 8.4.2. 
For a causal linear time-invariant discrete-time system with z transfer function $G(z)$
the relationship between the input sequence $\{u_k\}$ and output sequence $\{y_k\}$ in the
transform domain is given from Section 6.6.1 by

\[ Y(z) = G(z)U(z) \tag{8.69} \]

where $U(z) = \mathcal{Z}\{u_k\}$ and $Y(z) = \mathcal{Z}\{y_k\}$.
To investigate the system frequency response, we seek the output sequence
corresponding to an input sequence

\[ \{u_k\} = \{A e^{j\omega kT}\} = \{A e^{jk\theta}\}, \quad \theta = \omega T \tag{8.70} \]

which represents samples drawn, at equal intervals $T$, from the continuous-time
complex sinusoidal signal $e^{j\omega t}$.

The frequency response of the discrete-time system is then its steady-state response
to the sequence $\{u_k\}$ given in (8.70). As for the continuous-time case (Section 8.4.2),
the complex form $e^{j\omega t}$ is used in order to simplify the algebra, and the steady-state
sinusoidal response is easily recovered by taking imaginary parts, if necessary.

From Figure 6.3, we saw that

\[ \mathcal{Z}\{A e^{jk\theta}\} = \mathcal{Z}\{A (e^{j\theta})^k\} = \frac{Az}{z - e^{j\theta}} \]

so, from (8.69), the response of the system to the input sequence (8.70) is determined by

\[ Y(z) = G(z)\, \frac{Az}{z - e^{j\theta}} \tag{8.71} \]

Taking the system to be of order $n$, and under the assumption that the $n$ poles $p_r$
($r = 1, 2, \ldots, n$) of $G(z)$ are distinct and none is equal to $e^{j\theta}$, we can expand $Y(z)/z$ in
terms of partial fractions to give

\[ \frac{Y(z)}{z} = \frac{c}{z - e^{j\theta}} + \sum_{r=1}^{n} \frac{c_r}{z - p_r} \tag{8.72} \]

where, in general, the constants $c_r$ ($r = 1, 2, \ldots, n$) are complex. Taking inverse z
transforms throughout in (8.72) then gives the response sequence as

\[ \{y_k\} = \mathcal{Z}^{-1}\{Y(z)\} = \mathcal{Z}^{-1}\left\{\frac{zc}{z - e^{j\theta}}\right\} + \sum_{r=1}^{n} \mathcal{Z}^{-1}\left\{\frac{z c_r}{z - p_r}\right\} \]

that is,

\[ \{y_k\} = c\{e^{jk\theta}\} + \sum_{r=1}^{n} c_r \{p_r^k\} \tag{8.73} \]

If the transfer function $G(z)$ corresponds to a stable discrete-time system then all its
poles $p_r$ ($r = 1, 2, \ldots, n$) lie within the unit circle $|z| < 1$, so that all the terms under the
summation sign in (8.73) tend to zero as $k \to \infty$. This is clearly seen by expressing
$p_r$ in the form $p_r = |p_r| e^{j\phi_r}$ and noting that if $|p_r| < 1$ then $|p_r|^k \to 0$ as $k \to \infty$.
Consequently, for stable systems the steady-state response corresponding to (8.73) is

\[ \{y_{k_{ss}}\} = c\{e^{jk\theta}\} \]

Using the 'cover-up' rule for partial fractions, the constant $c$ is readily determined from
(8.71) as

\[ c = A G(e^{j\theta}) \]

so that the steady-state response becomes

\[ \{y_{k_{ss}}\} = A G(e^{j\theta})\{e^{jk\theta}\} \tag{8.74} \]

We have assumed that the poles of $G(z)$ are distinct in order to simplify the algebra.
Extending the development to accommodate multiple poles is readily accomplished,
leading to the same steady-state response as given in (8.74).

The result (8.74) corresponds to (8.38) for continuous-time systems, and indicates
that the steady-state response sequence is simply the input sequence with each term
multiplied by $G(e^{j\theta})$. Consequently $G(e^{j\theta})$ is called the frequency transfer function of
the discrete-time system and, as for the continuous case, it characterizes the system's
frequency response. Clearly $G(e^{j\theta})$ is simply $G(z)$, the z transfer function, with $z = e^{j\theta}$,
and so we are simply evaluating the z transfer function around the unit circle $|z| = 1$.
The z transfer function $G(z)$ will exist on $|z| = 1$ if and only if the system is stable,
and thus the result is the exact analogue of the result for continuous-time systems in
Section 8.4.2, where the Laplace transfer function was evaluated along the imaginary
axis to yield the frequency response of a stable linear continuous-time system.

To complete the analogy with continuous-time systems, we need one further result.
From Section 6.6.2, the impulse response of the linear causal discrete-time system with
z transfer function $G(z)$ is

\[ \{y_{k_\delta}\} = \mathcal{Z}^{-1}\{G(z)\} = \{g_k\}_{k=0}^{\infty}, \quad \text{say} \]

Taking inverse transforms then gives

\[ G(z) = \sum_{k=0}^{\infty} g_k z^{-k} = \sum_{k=-\infty}^{\infty} g_k z^{-k} \]

since $g_k = 0$ ($k < 0$) for a causal system. Thus

\[ G(e^{j\theta}) = \sum_{k=-\infty}^{\infty} g_k e^{-jk\theta} \]

and we conclude from (8.67) that $G(e^{j\theta})$ is simply the Fourier transform of the sequence
$\{g_k\}$. Therefore the discrete-time frequency transfer function $G(e^{j\theta})$ is the Fourier
transform of the impulse response sequence.


Example 8.18 Determine the frequency transfer function of the causal discrete-time system shown in
Figure 8.22 and plot its amplitude spectrum.

Figure 8.22 Discrete-time system of Example 8.18.
Solution Using the methods of Section 6.6.1, we readily obtain the z transfer function as

\[ G(z) = \frac{2z + 1}{z^2 + 0.75z + 0.125} \]

Next we check for system stability. Since $z^2 + 0.75z + 0.125 = (z + 0.5)(z + 0.25)$, the
poles of $G(z)$ are at $p_1 = -0.5$ and $p_2 = -0.25$, and since both are inside the unit circle
$|z| = 1$, the system is stable. The frequency transfer function may then be obtained as
$G(e^{j\theta})$, where

\[ G(e^{j\theta}) = \frac{2e^{j\theta} + 1}{e^{j2\theta} + 0.75e^{j\theta} + 0.125} \]

To determine the amplitude spectrum, we evaluate $|G(e^{j\theta})|$ as

\[ |G(e^{j\theta})| = \frac{|2e^{j\theta} + 1|}{|e^{j2\theta} + 0.75e^{j\theta} + 0.125|}
   = \frac{\sqrt{5 + 4\cos\theta}}{\sqrt{1.578 + 1.688\cos\theta + 0.25\cos 2\theta}} \]

A plot of $|G(e^{j\theta})|$ versus $\theta$ then leads to the amplitude spectrum of Figure 8.23.

Figure 8.23 Amplitude spectrum of the system of Example 8.18.


In Example 8.18 we note the periodic behaviour of the amplitude spectrum, which
is inescapable when discrete-time signals and systems are concerned. Note, however,
that the periodicity is in the variable $\theta = \omega T$ and that we may have control over the
choice of $T$, the time between samples of our input signal.
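The steady-state result (8.74) and the amplitude spectrum of Example 8.18 can be explored numerically. The sketch below rests on an assumption: since Figure 8.22 is not reproduced here, the difference equation $y_{k+2} + 0.75y_{k+1} + 0.125y_k = 2u_{k+1} + u_k$ is inferred from the transfer function $G(z) = (2z + 1)/(z^2 + 0.75z + 0.125)$. The function names are illustrative only.

```python
import cmath
import math


def G(z):
    """z transfer function of Example 8.18."""
    return (2 * z + 1) / (z * z + 0.75 * z + 0.125)


def amplitude(theta):
    """Closed-form |G(e^{j theta})|, using the unrounded denominator constants."""
    num = math.sqrt(5 + 4 * math.cos(theta))
    den = math.sqrt(1.578125 + 1.6875 * math.cos(theta) + 0.25 * math.cos(2 * theta))
    return num / den


def simulate(theta, steps=200):
    """Drive the (assumed) difference equation with u_k = e^{j k theta}, from rest."""
    u = [cmath.exp(1j * k * theta) for k in range(steps + 2)]
    y = [0j, 0j]
    for k in range(steps):
        y.append(-0.75 * y[k + 1] - 0.125 * y[k] + 2 * u[k + 1] + u[k])
    return y


theta = 1.0
y = simulate(theta)
k_last = len(y) - 1
steady = G(cmath.exp(1j * theta)) * cmath.exp(1j * k_last * theta)

print("simulated y_k      =", y[k_last])
print("G(e^{j theta}) u_k =", steady)
print("|G| direct, closed =", abs(G(cmath.exp(1j * theta))), amplitude(theta))
```

Because both poles lie well inside the unit circle, the transient terms in (8.73) have decayed to a negligible level after a few dozen steps, so the simulated output and $G(e^{j\theta})e^{jk\theta}$ should agree closely.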


8.6.3 The discrete Fourier transform


The Fourier transform of sequences discussed in Section 8.6.2 transforms a sequence
$\{g_k\}$ into a continuous function $G(e^{j\theta})$ of a frequency variable $\theta$, where $\theta = \omega T$ and $T$
is the time between signal samples. In this section, with an eye to computer
requirements, we look at the implications of sampling $G(e^{j\theta})$. The overall operation will have
commenced with samples of a time signal $\{g_k\}$ and proceeded via a Fourier transformation
process, finally producing a sequence $\{G_k\}$ of samples drawn from the
frequency-domain image $G(e^{j\theta})$ of $\{g_k\}$.

Suppose that we have a sequence $\{g_k\}$ of $N$ samples drawn from a continuous-time
signal $g(t)$, at equal intervals $T$; that is,

\[ \{g_k\} = \{g(kT)\}_{k=0}^{N-1} \]

Using (8.67), the Fourier transform of this sequence is

\[ \mathcal{F}\{g_k\} = G(e^{j\theta}) = \sum_{n=-\infty}^{\infty} g_n e^{-jn\theta} \tag{8.75} \]

where $g_k = 0$ ($k \notin [0, N - 1]$). Then, with $\theta = \omega T$, we may write (8.75) as

\[ G(e^{j\omega T}) = \sum_{n=0}^{N-1} g_n e^{-jn\omega T} \tag{8.76} \]

We now sample this transform $G(e^{j\omega T})$ at intervals $\Delta\omega$ in such a way as to create
$N$ samples spread equally over the interval $0 \le \theta \le 2\pi$; that is, over one period of the
essentially periodic function $G(e^{j\theta})$. We then have

\[ N\,\Delta\theta = 2\pi \]

where $\Delta\theta$ is the normalized frequency spacing. Since $\theta = \omega T$ and $T$ is a constant such
that $\Delta\theta = T\,\Delta\omega$, we deduce that

\[ \Delta\omega = \frac{2\pi}{NT} \tag{8.77} \]

Sampling (8.76) at intervals $\Delta\omega$ produces the sequence

\[ \{G_k\}_{k=0}^{N-1}, \quad \text{where} \quad G_k = \sum_{n=0}^{N-1} g_n e^{-jnk\Delta\omega T} \tag{8.78} \]

Since

\[ G_{k+N} = \sum_{n=0}^{N-1} g_n e^{-jn(k+N)\Delta\omega T}
   = \sum_{n=0}^{N-1} g_n e^{-jnk\Delta\omega T} e^{-j2\pi n} \quad \text{(using (8.77))}
   = \sum_{n=0}^{N-1} g_n e^{-jnk\Delta\omega T} = G_k \]

it follows that the sequence $\{G_k\}_{-\infty}^{\infty}$ is periodic, with period $N$. We have therefore
generated a sequence of samples in the frequency domain that in some sense represents
the spectrum of the underlying continuous-time signal. We shall postpone the question
of the exact nature of this representation for the moment, but as the reader will have
guessed, it is crucial to the purpose of this section. First, we consider the question of
whether, from knowledge of the sequence $\{G_k\}_{k=0}^{N-1}$ of (8.78), we can recover the
original sequence $\{g_n\}_{n=0}^{N-1}$. To see how this can be achieved, consider a sum of the form

\[ S_r = \sum_{k=0}^{N-1} G_k e^{-jkr\Delta\omega T}, \quad -(N - 1) \le r \le 0 \tag{8.79} \]
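Substituting for $G_k$ from (8.78), we have

\[ S_r = \sum_{k=0}^{N-1}\left(\sum_{m=0}^{N-1} g_m e^{-jmk\Delta\omega T}\right) e^{-jkr\Delta\omega T}
   = \sum_{k=0}^{N-1}\sum_{m=0}^{N-1} g_m e^{-jk\Delta\omega(m+r)T} \]

That is, on interchanging the order of summation,

\[ S_r = \sum_{m=0}^{N-1} g_m \sum_{k=0}^{N-1} e^{-jk\Delta\omega(m+r)T} \tag{8.80} \]

Now

\[ \sum_{k=0}^{N-1} e^{-jk\Delta\omega(m+r)T} \]

is a geometric progression with first term $e^0 = 1$ and common ratio $e^{-j\Delta\omega(m+r)T}$, and so the
sum to $N$ terms is

\[ \sum_{k=0}^{N-1} e^{-jk\Delta\omega(m+r)T}
   = \frac{1 - e^{-j\Delta\omega(m+r)NT}}{1 - e^{-j\Delta\omega(m+r)T}}
   = \frac{1 - e^{-j(m+r)2\pi}}{1 - e^{-j\Delta\omega(m+r)T}} = 0 \quad (m \ne -r + nN) \]

When $m = -r$,

\[ \sum_{k=0}^{N-1} e^{-jk\Delta\omega(m+r)T} = \sum_{k=0}^{N-1} 1 = N \]

Thus

\[ \sum_{k=0}^{N-1} e^{-jk\Delta\omega(m+r)T} = N\delta_{m,-r} \tag{8.81} \]

where $\delta_{ij}$ is the Kronecker delta defined by

\[ \delta_{ij} = \begin{cases} 1 & (i = j) \\ 0 & (i \ne j) \end{cases} \]

Substituting (8.81) into (8.80), we have

\[ S_r = N\sum_{m=0}^{N-1} g_m \delta_{m,-r} = N g_{-r} \]

Returning to (8.79) and substituting for $S_r$, we see that

\[ g_{-r} = \frac{1}{N}\sum_{k=0}^{N-1} G_k e^{-jkr\Delta\omega T} \]

which on taking $n = -r$ gives

\[ g_n = \frac{1}{N}\sum_{k=0}^{N-1} G_k e^{jkn\Delta\omega T} \tag{8.82} \]

Thus (8.82) allows us to determine the members of the sequence

\[ \{g_n\}_{n=0}^{N-1} \]

that is, it enables us to recover the time-domain samples from the frequency-domain
samples exactly.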
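The orthogonality relation (8.81), on which the inversion rests, is easy to confirm numerically. The following sketch uses the fact that $\Delta\omega T = 2\pi/N$ from (8.77); the function name `geometric_sum` and the choice $N = 8$ are illustrative assumptions.

```python
import cmath
import math


def geometric_sum(m, r, N):
    """Sum over k = 0..N-1 of exp(-j k (2 pi / N) (m + r)), i.e. the sum in (8.81)."""
    return sum(cmath.exp(-2j * math.pi * k * (m + r) / N) for k in range(N))


N = 8
for r in range(-(N - 1), 1):          # the range of r used in (8.79)
    row = [round(abs(geometric_sum(m, r, N)), 6) for m in range(N)]
    print(f"r = {r:2d}: {row}")       # a single entry N at m = -r, zeros elsewhere
```

Each printed row contains the value $N$ in the position $m = -r$ and (numerically) zero elsewhere, which is exactly the behaviour of $N\delta_{m,-r}$.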


The relations

\[ G_k = \sum_{n=0}^{N-1} g_n e^{-jnk\Delta\omega T} \tag{8.78} \]

\[ g_n = \frac{1}{N}\sum_{k=0}^{N-1} G_k e^{jnk\Delta\omega T} \tag{8.82} \]

with $\Delta\omega = 2\pi/NT$, between the time- and frequency-domain sequences $\{g_n\}_{n=0}^{N-1}$ and
$\{G_k\}_{k=0}^{N-1}$ define the discrete Fourier transform (DFT) pair. The pair provide pathways
between time and frequency domains for discrete-time signals in exactly the same sense
that (8.15) and (8.16) defined similar pathways for continuous-time signals. It should
be stressed again that, whatever the properties of the sequences $\{g_n\}$ and $\{G_k\}$ on the
right-hand sides of (8.78) and (8.82), the sequences generated on the left-hand sides
will be periodic, with period $N$.


Example 8.19 The sequence $\{g_k\}_{k=0}^{2} = \{1, 2, 1\}$ is generated by sampling a time signal $g(t)$ at intervals
with $T = 1$. Determine the discrete Fourier transform of the sequence, and verify that
the sequence can be recovered exactly from its transform.

Solution From (8.78), the discrete Fourier transform sequence $\{G_k\}_{k=0}^{2}$ is generated by

\[ G_k = \sum_{n=0}^{2} g_n e^{-jkn\Delta\omega T} \quad (k = 0, 1, 2) \]

In this case $T = 1$ and, with $N = 3$, (8.77) gives

\[ \Delta\omega = \frac{2\pi}{3 \times 1} = \tfrac{2}{3}\pi \]

Thus

\[ G_0 = \sum_{n=0}^{2} g_n e^{-jn \times 0 \times 2\pi/3} = \sum_{n=0}^{2} g_n = 1 + 2 + 1 = 4 \]

\[ G_1 = \sum_{n=0}^{2} g_n e^{-jn \times 1 \times 2\pi/3} = g_0 e^{0} + g_1 e^{-j2\pi/3} + g_2 e^{-j4\pi/3}
   = 1 + 2e^{-j2\pi/3} + e^{-j4\pi/3} \]
\[ \phantom{G_1} = e^{-j2\pi/3}\left(e^{j2\pi/3} + 2 + e^{-j2\pi/3}\right) = 2e^{-j2\pi/3}\left(1 + \cos\tfrac{2}{3}\pi\right) = e^{-j2\pi/3} \]

\[ G_2 = \sum_{n=0}^{2} g_n e^{-jn \times 2 \times 2\pi/3} = g_0 e^{0} + g_1 e^{-j4\pi/3} + g_2 e^{-j8\pi/3} \]
\[ \phantom{G_2} = e^{-j4\pi/3}\left[e^{j4\pi/3} + 2 + e^{-j4\pi/3}\right] = 2e^{-j4\pi/3}\left(1 + \cos\tfrac{4}{3}\pi\right) = e^{-j4\pi/3} \]

Thus

\[ \{G_k\}_{k=0}^{2} = \{4, e^{-j2\pi/3}, e^{-j4\pi/3}\} \]
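We must now show that use of (8.82) will recover the original sequence $\{g_k\}_{k=0}^{2}$. From
(8.82), the inverse transform of $\{G_k\}_{k=0}^{2}$ is given by

\[ \tilde{g}_n = \frac{1}{N}\sum_{k=0}^{N-1} G_k e^{jkn\Delta\omega T} \]

again with $T = 1$, $\Delta\omega = \tfrac{2}{3}\pi$ and $N = 3$. Thus

\[ \tilde{g}_0 = \tfrac{1}{3}\sum_{k=0}^{2} G_k e^{jk \times 0 \times 2\pi/3} = \tfrac{1}{3}\sum_{k=0}^{2} G_k
   = \tfrac{1}{3}\left(4 + e^{-j2\pi/3} + e^{-j4\pi/3}\right) \]
\[ \phantom{\tilde{g}_0} = \tfrac{1}{3}\left[4 + e^{-j\pi}\left(e^{j\pi/3} + e^{-j\pi/3}\right)\right] = \tfrac{1}{3}\left(4 - 2\cos\tfrac{1}{3}\pi\right) = 1 \]

\[ \tilde{g}_1 = \tfrac{1}{3}\sum_{k=0}^{2} G_k e^{jk \times 1 \times 2\pi/3}
   = \tfrac{1}{3}\left(G_0 + G_1 e^{j2\pi/3} + G_2 e^{j4\pi/3}\right) = \tfrac{1}{3}(4 + 1 + 1) = 2 \]

\[ \tilde{g}_2 = \tfrac{1}{3}\sum_{k=0}^{2} G_k e^{jk \times 2 \times 2\pi/3}
   = \tfrac{1}{3}\left(G_0 + G_1 e^{j4\pi/3} + G_2 e^{j8\pi/3}\right) \]
\[ \phantom{\tilde{g}_2} = \tfrac{1}{3}\left(4 + e^{j2\pi/3} + e^{j4\pi/3}\right) = \tfrac{1}{3}\left(4 - 2\cos\tfrac{1}{3}\pi\right) = 1 \]

That is

\[ \{\tilde{g}_n\}_{n=0}^{2} = \{1, 2, 1\} = \{g_k\}_{k=0}^{2} \]

and thus the original sequence has been recovered exactly from its transform.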
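The DFT pair (8.78) and (8.82) translates almost directly into code. The following is a minimal sketch of the direct (order $N^2$) evaluation, using $\Delta\omega T = 2\pi/N$ from (8.77); the function names `dft` and `inverse_dft` are chosen here for illustration, and the check reuses the sequence of Example 8.19.

```python
import cmath
import math


def dft(g):
    """Direct evaluation of (8.78): G_k = sum_n g_n e^{-j n k 2 pi / N}."""
    N = len(g)
    return [
        sum(g[n] * cmath.exp(-2j * math.pi * n * k / N) for n in range(N))
        for k in range(N)
    ]


def inverse_dft(G):
    """Direct evaluation of (8.82): g_n = (1/N) sum_k G_k e^{j n k 2 pi / N}."""
    N = len(G)
    return [
        sum(G[k] * cmath.exp(2j * math.pi * n * k / N) for k in range(N)) / N
        for n in range(N)
    ]


g = [1, 2, 1]                      # sequence of Example 8.19
G = dft(g)
recovered = inverse_dft(G)

print("G_k       :", G)            # expected: 4, e^{-j 2 pi/3}, e^{-j 4 pi/3}
print("recovered :", recovered)    # expected: 1, 2, 1 (up to rounding)
```

The nested sums make the cost of the direct approach visible: each of the $N$ output terms requires $N$ complex multiplications.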


We see from Example 8.19 that the operation of calculating $N$ terms of the transformed
sequence involved $N \times N = N^2$ multiplications and $N(N - 1)$ summations, all of which
are operations involving complex numbers in general. The computation of the discrete
Fourier transform in this direct manner is thus said to be a computation of complexity
$N^2$. Such computations rapidly become impossible as $N$ increases, owing to the time
required for their execution.


8.6.4 Estimation of the continuous Fourier transform


We saw in Section 8.4.2 that the continuous Fourier transform provides a means of 
examining the frequency response of a stable linear time-invariant continuous-time 
system. Similarly, we saw in Section 8.6.2 how a discrete-time Fourier transform could 
be developed that allows examination of the frequency response of a stable linear time- 
invariant discrete-time system. By sampling this latter transform, we developed the 
discrete Fourier transform itself. Why did we do this? First we have found a way (at 
least in theory) of involving the computer in our efforts. Secondly, as we shall now 
show, we can use the discrete Fourier transform to estimate the continuous Fourier 
transform of a continuous-time signal. To see how this is done, let us first examine
what happens when we sample a continuous-time signal. 

Suppose that $f(t)$ is a non-periodic continuous-time signal, a portion of which is
shown in Figure 8.24(a). Let us sample the signal at equal intervals $T$, to generate the
sequence

\[ \{f(0), f(T), \ldots, f(nT), \ldots\} \]

as shown in Figure 8.24(b). Imagine now that each of these samples is presented in turn,
at the appropriate instant, as the input to a continuous linear time-invariant system with
impulse response $h(t)$. The output would then be, from Section 5.6.6,

\[ y(t) = \int_{-\infty}^{\infty} h(t - \tau)f(0)\delta(\tau)\, d\tau + \int_{-\infty}^{\infty} h(t - \tau)f(T)\delta(\tau - T)\, d\tau
   + \cdots + \int_{-\infty}^{\infty} h(t - \tau)f(nT)\delta(\tau - nT)\, d\tau + \cdots \]
\[ \phantom{y(t)} = \int_{-\infty}^{\infty} h(t - \tau)\sum_{k=0}^{\infty} f(kT)\delta(\tau - kT)\, d\tau \]

Thus

\[ y(t) = \int_{-\infty}^{\infty} h(t - \tau)f_s(\tau)\, d\tau \tag{8.83} \]

where

\[ f_s(t) = \sum_{k=0}^{\infty} f(kT)\delta(t - kT) = f(t)\sum_{k=0}^{\infty} \delta(t - kT) \tag{8.84} \]

which we identify as a 'continuous-time' representation of the sampled version of $f(t)$.
We are thus led to picture $f_s(t)$ as in Figure 8.25.

Figure 8.24 (a) Continuous-time signal $f(t)$; (b) samples drawn from $f(t)$.

Figure 8.25 Visualization of $f_s(t)$ defined in (8.84).

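In order to admit the possibility of signals that are non-zero for $t < 0$, we can
generalize (8.84) slightly by allowing in general that

\[ f_s(t) = f(t)\sum_{k=-\infty}^{\infty} \delta(t - kT) \tag{8.85} \]

We can now use convolution to find the Fourier transform $F_s(j\omega)$ of $f_s(t)$. Using the
representation (8.85) for $f_s(t)$, we have

\[ F_s(j\omega) = \mathcal{F}\{f_s(t)\} = \mathcal{F}\left\{f(t)\sum_{k=-\infty}^{\infty} \delta(t - kT)\right\} \]

which, on using (8.61), leads to

\[ F_s(j\omega) = \frac{1}{2\pi}\, F(j\omega) * \mathcal{F}\left\{\sum_{k=-\infty}^{\infty} \delta(t - kT)\right\} \tag{8.86} \]

where

\[ \mathcal{F}\{f(t)\} = F(j\omega) \]

From (8.54),

\[ \mathcal{F}\left\{\sum_{k=-\infty}^{\infty} \delta(t - kT)\right\} = \frac{2\pi}{T}\sum_{k=-\infty}^{\infty} \delta\left(\omega - \frac{2\pi k}{T}\right) \]

so that, assuming the interchange of the order of integration and summation to be
possible, (8.86) becomes

\[ F_s(j\omega) = \frac{1}{2\pi}\, F(j\omega) * \frac{2\pi}{T}\sum_{k=-\infty}^{\infty} \delta\left(\omega - \frac{2\pi k}{T}\right) \]
\[ \phantom{F_s(j\omega)} = \frac{1}{T}\int_{-\infty}^{\infty} F(j[\omega - \omega'])\sum_{k=-\infty}^{\infty} \delta\left(\omega' - \frac{2\pi k}{T}\right) d\omega' \]
\[ \phantom{F_s(j\omega)} = \frac{1}{T}\sum_{k=-\infty}^{\infty}\int_{-\infty}^{\infty} F(j[\omega - \omega'])\,\delta\left(\omega' - \frac{2\pi k}{T}\right) d\omega' \]
\[ \phantom{F_s(j\omega)} = \frac{1}{T}\sum_{k=-\infty}^{\infty} F\left(j\left(\omega - \frac{2\pi k}{T}\right)\right) \]

Thus

\[ F_s(j\omega) = \frac{1}{T}\sum_{k=-\infty}^{\infty} F(j[\omega - k\omega_0]), \quad \omega_0 = \frac{2\pi}{T} \tag{8.87} \]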


Examining (8.87), we see that the spectrum $F_s(j\omega)$ of the sampled version $f_s(t)$ of
$f(t)$ consists of repeats of the spectrum $F(j\omega)$ of $f(t)$ scaled by a factor $1/T$, these repeats
being spaced at intervals $\omega_0 = 2\pi/T$ apart. Figure 8.26(a) shows the amplitude spectrum
$|F(j\omega)|$ of a band-limited signal $f(t)$; that is, a signal whose spectrum is zero for $|\omega| > \omega_m$.
Figures 8.26(b)-(e) show the amplitude spectrum $|F_s(j\omega)|$ of the sampled version for
increasing values of the sampling interval $T$. Clearly, as $T$ increases, the spectrum of
$F(j\omega)$, as observed using $|F_s(j\omega)|$ in $-\omega_m < \omega < \omega_m$, becomes more and more misleading
because of 'interaction' from neighbouring copies.

Figure 8.26 (a) Amplitude spectrum of a band-limited signal $f(t)$; (b)-(e) amplitude
spectrum $|F_s(j\omega)|$ of $f_s(t)$, showing periodic repetition of $|F_s(j\omega)|$ and interaction effects
as $T$ increases.

Figure 8.27 Sampling of a continuous-time signal.

As we saw in Section 8.6.2, the periodicity in the amplitude spectrum $|F_s(j\omega)|$ of $f_s(t)$
is inevitable as a consequence of the sampling process, and ways have to be found to
minimize the problems it causes. The interaction observed in Figure 8.26 between the
periodic repeats is known as aliasing error, and it is clearly essential to minimize this
effect. This can be achieved in an obvious way if the original unsampled signal $f(t)$ is
band-limited as in Figure 8.26(a). It is apparent that we must arrange that the periodic
repeats of $|F(j\omega)|$ be far enough apart to prevent interaction between the copies. This
implies that we have

\[ \omega_0 \ge 2\omega_m \]

at an absolute (and impractical!) minimum. Since $\omega_0 = 2\pi/T$, the constraint implies that

\[ T \le \pi/\omega_m \]

where $T$ is the interval between samples. The largest sampling interval allowed is

\[ T_{\max} = \pi/\omega_m \]

which is known as the Nyquist interval, and we have in fact deduced a form of the
Nyquist-Shannon sampling theorem. If $T < T_{\max}$ then the 'copies' of $F(j\omega)$ are
isolated from each other, and we can focus on just one copy, either for the purpose of
signal reconstruction, or for the purposes of the estimation of $F(j\omega)$ itself. Here we are
concerned only with the latter problem. Basically, we have established a condition
under which the spectrum of the samples of the band-limited signal $f(t)$, that is the
spectrum of $f_s(t)$, can be used to estimate $F(j\omega)$.
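The effect of violating the sampling condition can be seen with a very small experiment. In this sketch the test signal $\cos 3\pi t$ (so that $\omega_m = 3\pi$ and the Nyquist interval is $\pi/\omega_m = 1/3$) and the helper names are assumptions made purely for illustration. Sampling with $T = 0.5$ places the repeats at $\omega_0 = 4\pi$, so the component at $3\pi$ is indistinguishable from one at $4\pi - 3\pi = \pi$.

```python
import math


def nyquist_interval(omega_m):
    """Largest sampling interval, pi / omega_m, for a signal band-limited to |omega| <= omega_m."""
    return math.pi / omega_m


def cosine_samples(omega, T, N):
    """Samples cos(omega k T) for k = 0, ..., N-1."""
    return [math.cos(omega * k * T) for k in range(N)]


omega_signal = 3 * math.pi
print("Nyquist interval:", nyquist_interval(omega_signal))      # 1/3

# T = 0.5 exceeds the Nyquist interval: samples coincide with those of cos(pi t).
too_slow = cosine_samples(omega_signal, 0.5, 12)
alias = cosine_samples(math.pi, 0.5, 12)
print("max difference with T = 0.5 :", max(abs(a - b) for a, b in zip(too_slow, alias)))

# T = 0.25 is below the Nyquist interval: the two sample sets now differ.
fast_enough = cosine_samples(omega_signal, 0.25, 12)
reference = cosine_samples(math.pi, 0.25, 12)
print("max difference with T = 0.25:", max(abs(a - b) for a, b in zip(fast_enough, reference)))
```

With the larger interval the two sets of samples are identical to rounding error, so no processing of the samples could distinguish the two signals; with the smaller interval they differ clearly.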

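Suppose we have drawn $N$ samples from a continuous signal $f(t)$ at intervals $T$, in
accordance with the Nyquist criterion, as in Figure 8.27. We then consider

\[ f_s(t) = \sum_{k=0}^{N-1} f(kT)\delta(t - kT) \]

or equivalently, the sequence

\[ \{f_k\}_{k=0}^{N-1}, \quad \text{where} \quad f_k = f(kT) \]

Note that

\[ f_s(t) = 0 \quad (t > (N - 1)T) \]

so that

\[ f_k = 0 \quad (k > N - 1) \]

The Fourier transform of $f_s(t)$ is

\[ F_s(j\omega) = \int_{-\infty}^{\infty} f_s(t)\, e^{-j\omega t}\, dt
   = \int_{-\infty}^{\infty} \sum_{k=0}^{N-1} f(kT)\delta(t - kT)\, e^{-j\omega t}\, dt \]
\[ \phantom{F_s(j\omega)} = \sum_{k=0}^{N-1}\int_{-\infty}^{\infty} f(kT)\delta(t - kT)\, e^{-j\omega t}\, dt
   = \sum_{k=0}^{N-1} f(kT)\, e^{-j\omega kT} = \sum_{k=0}^{N-1} f_k\, e^{-j\omega kT} \tag{8.88} \]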

The transform in (8.88) is a function of the continuous variable $\omega$, so, as in (8.78), we
must now sample the continuous spectrum $F_s(j\omega)$ to permit computer evaluation.

We chose $N$ samples to represent $f(t)$ in the time domain, and for this reason we also
choose $N$ samples in the frequency domain to represent $F(j\omega)$. Thus we sample (8.88)
at intervals $\Delta\omega$, to generate the sequence

\[ \{F_s(jn\Delta\omega)\}_{n=0}^{N-1} \tag{8.89a} \]

where

\[ F_s(jn\Delta\omega) = \sum_{k=0}^{N-1} f_k\, e^{-jkn\Delta\omega T} \tag{8.89b} \]

We must now choose the frequency-domain sampling interval $\Delta\omega$. To see how to do
this, recall that the sampled spectrum $F_s(j\omega)$ consisted of repeats of $F(j\omega)$, spaced at
intervals $2\pi/T$ apart. Thus to sample just one copy in its entirety, we should choose

\[ N\Delta\omega = 2\pi/T \]

or

\[ \Delta\omega = 2\pi/NT \tag{8.90} \]

Note that the resulting sequence, defined outside $0 \le n \le N - 1$, is periodic, as
we should expect. However, note also that, following our discussion in Section 8.6,
the process of recovering a time signal from samples of its spectrum will result in
a periodic waveform, whatever the nature of the original time signal. We should not be
surprised by this, since it is exactly in accordance with our introductory discussion in
Section 8.1.

In view of the scaling factor $1/T$ in (8.87), our estimate of the Fourier transform
$F(j\omega)$ of $f(t)$ over the interval

\[ 0 \le t \le (N - 1)T \]

will, from (8.89), be the sequence of samples

\[ \{T F_s(jn\Delta\omega)\}_{n=0}^{N-1} \]

where

\[ T F_s(jn\Delta\omega) = T\sum_{k=0}^{N-1} f_k\, e^{-jkn\Delta\omega T} \]

which, from the definition of the discrete Fourier transform in (8.78), gives

\[ T F_s(jn\Delta\omega) = T \times \mathrm{DFT}\{f_k\} \]

where $\mathrm{DFT}\{f_k\}$ is the discrete Fourier transform of the sequence $\{f_k\}$. We illustrate the
use of this estimate in Example 8.20.


Example 8.20 The delayed triangular pulse $f(t)$ is as illustrated in Figure 8.28. Estimate its Fourier
transform using 10 samples and compare with the exact values.

Figure 8.28 The delayed triangular pulse.

Solution Using $N = 10$ samples at intervals $T = 0.2$ s, we generate the sequence

\[ \{f_k\}_{k=0}^{9} = \{f(0), f(0.2), f(0.4), f(0.6), f(0.8), f(1.0), f(1.2), f(1.4), f(1.6), f(1.8)\} \]

Clearly, from Figure 8.28, we can express the continuous function $f(t)$ as

\[ f(t) = \begin{cases} t & (0 \le t \le 0.5) \\ 1 - t & (0.5 < t < 1) \\ 0 & (t \ge 1) \end{cases} \]

and so

\[ \{f_k\}_{k=0}^{9} = \{0, 0.2, 0.4, 0.4, 0.2, 0, 0, 0, 0, 0\} \]

Using (8.78), the discrete Fourier transform $\{F_n\}_{n=0}^{9}$ of the sequence $\{f_k\}_{k=0}^{9}$ is
generated by

\[ F_n = \sum_{k=0}^{9} f_k\, e^{-jkn\Delta\omega T}, \quad \text{where} \quad \Delta\omega = \frac{2\pi}{NT} = \frac{2\pi}{10 \times 0.2} = \pi \]

That is,

\[ F_n = \sum_{k=0}^{9} f_k\, e^{-jkn(0.2\pi)} \]

or, since $f_0 = f_5 = f_6 = f_7 = f_8 = f_9 = 0$,

\[ F_n = \sum_{k=1}^{4} f_k\, e^{-jnk(0.2\pi)} \]

The estimate of the Fourier transform, also based on $N = 10$ samples, is then the
sequence

\[ \{T F_n\}_{n=0}^{9} = \{0.2 F_n\}_{n=0}^{9} \]

We thus have 10 values representing the Fourier transform at

\[ \omega = n\Delta\omega \quad (n = 0, 1, 2, \ldots, 9) \]

or, since $\Delta\omega = 2\pi/NT$,

\[ \omega = 0, \pi, 2\pi, \ldots, 9\pi \]

At $\omega = \pi$, corresponding to $n = 1$, our estimate is

\[ 0.2F_1 = 0.2\sum_{k=1}^{4} f_k\, e^{-jk(0.2\pi)}
   = 0.2\left[0.2e^{-j(0.2\pi)} + 0.4\left(e^{-j(0.4\pi)} + e^{-j(0.6\pi)}\right) + 0.2e^{-j(0.8\pi)}\right] = -0.1992j \]

At $\omega = 2\pi$, corresponding to $n = 2$, our estimate is

\[ 0.2F_2 = 0.2\sum_{k=1}^{4} f_k\, e^{-jk(0.4\pi)}
   = 0.2\left[0.2e^{-j(0.4\pi)} + 0.4\left(e^{-j(0.8\pi)} + e^{-j(1.2\pi)}\right) + 0.2e^{-j(1.6\pi)}\right] = -0.1047 \]

Continuing in this manner, we compute the sequence

\[ \{0.2F_0, 0.2F_1, \ldots, 0.2F_9\} \]

as

\[ \{0.2400, -0.1992j, -0.1047, 0.0180j, -0.0153, 0, -0.0153, -0.0180j, -0.1047, 0.1992j\} \]

This then represents the estimate of the Fourier transform of the continuous function
$f(t)$. The exact value of the Fourier transform of $f(t)$ is easily computed by direct use of
the definition (8.15) as

\[ F(j\omega) = \mathcal{F}\{f(t)\} = \tfrac{1}{4}\, e^{-j\omega/2}\operatorname{sinc}^2\tfrac{1}{4}\omega \]

which we can use to examine the validity of our result. The comparison is shown in
Figure 8.29 and illustrated graphically in Figure 8.30.
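The calculation in Example 8.20 is easily mechanized. The sketch below evaluates $T \times \mathrm{DFT}\{f_k\}$ directly from (8.78) and compares it with the exact transform; the function names are illustrative, and $\operatorname{sinc} x$ is taken to mean $(\sin x)/x$, which is the convention consistent with the exact values quoted in the example.

```python
import cmath
import math


def f(t):
    """Delayed triangular pulse of Example 8.20."""
    if 0 <= t <= 0.5:
        return t
    if 0.5 < t < 1:
        return 1 - t
    return 0.0


def sinc(x):
    """sinc x = sin(x)/x, with the limiting value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x


def exact_transform(omega):
    """Exact F(j omega) = (1/4) e^{-j omega/2} sinc^2(omega/4)."""
    return 0.25 * cmath.exp(-0.5j * omega) * sinc(omega / 4) ** 2


def dft_estimate(samples, T):
    """T x DFT{f_k}: estimate of F(j n delta_omega), with delta_omega = 2 pi / (N T)."""
    N = len(samples)
    return [
        T * sum(samples[k] * cmath.exp(-2j * math.pi * k * n / N) for k in range(N))
        for n in range(N)
    ]


N, T = 10, 0.2
samples = [f(k * T) for k in range(N)]
estimate = dft_estimate(samples, T)
delta_omega = 2 * math.pi / (N * T)          # equals pi here

for n in range(N):
    omega = n * delta_omega
    print(f"n = {n}:  exact = {exact_transform(omega):.4f}   estimate = {estimate[n]:.4f}")
```

The printed values can be compared line by line with the estimate sequence and the exact values given in the example.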
Figure 8.29 Comparison of exact results and DFT estimate for the amplitude spectrum
of the signal of Example 8.20.

ω     Exact F(jω)   DFT estimate   |F(jω)|   |DFT estimate|   % error
0     0.2500        0.2400         0.2500    0.2400           4%
π     -0.2026j      -0.1992j       0.2026    0.1992           1.7%
2π    -0.1013       -0.1047        0.1013    0.1047           3.2%
3π    0.0225j       0.0180j        0.0225    0.0180           20%
4π    0             -0.0153        0         0.0153           -
5π    -0.0081j      0              0.0081    0                -
6π    -0.0113       -0.0153        0.0113    0.0153           -
7π    0.0041j       -0.0180j       0.0041    0.0180           -
8π    0             -0.1047        0         0.1047           -
9π    -0.0025j      0.1992j        0.0025    0.1992           -

Figure 8.30 Exact result $|F(j\omega)|$ (×) and DFT estimate $|TF_n|$ (□) of the Fourier
transform in Example 8.20.


From the Nyquist-Shannon sampling theorem, with $T = 0.2$ s, we deduce that our
results will be completely accurate if the original signal $f(t)$ is band-limited with a zero
spectrum for $|\omega| > |\omega_m| = 5\pi$. Our signal is not strictly band-limited in this way, and
we thus expect to observe some error in our results, particularly near $\omega = 5\pi$, because
of the effects of aliasing. The estimate obtained is satisfactory at $\omega = 0, \pi, 2\pi$, but
begins to lose accuracy at $\omega = 3\pi$. Results obtained above $\omega = 5\pi$ are seen to be images
of those obtained for values below $\omega = 5\pi$, and this is to be expected owing to the
periodicity of the DFT. In our calculation the DFT sequence will be periodic, with
period $N = 10$; thus, for example,

\[ |TF_6| = |TF_{6-10}| = |TF_{-4}| \]

As we have seen many times, for a real signal the amplitude spectrum is symmetric about
$\omega = 0$. Thus $|F_{-4}| = |F_4|$, $|F_{-5}| = |F_5|$, and so on, and the effects of the symmetry are
apparent in Figure 8.29. It is perhaps worth observing that if we had calculated (say)
$\{TF_{-4}, TF_{-3}, \ldots, TF_0, TF_1, \ldots, TF_5\}$, we should have obtained a 'conventional' plot, with
the right-hand portion, beyond $\omega = 5\pi$, translated to the left of the origin. However,
using the plot of the amplitude spectrum in the chosen form does highlight the source
of error due to aliasing.


In this section we have discussed a method by which Fourier transforms can be 
estimated numerically, at least in theory. It is apparent, though, that the amount of labour 
involved is significant, and as we observed in Section 8.6.3 an algorithm based on this 
approach is in general prohibitive in view of the amount of computing time required. 
The next section gives a brief introduction to a method of overcoming this problem. 


8.6.5 The fast Fourier transform


The calculation of a discrete Fourier transform based on $N$ sample values requires, as
we have seen, $N^2$ complex multiplications and $N(N - 1)$ summations. For real signals,
symmetry can be exploited, but for large $N$, $\tfrac{1}{2}N^2$ does not represent a significant
improvement over $N^2$ for the purposes of computation. In fact, a totally new approach to the
problem was required before the discrete Fourier transform could become a practical
engineering tool. In 1965 Cooley and Tukey introduced the fast Fourier transform
(FFT) in order to reduce the computational complexity (J. W. Cooley and J. W. Tukey,
An algorithm for the machine computation of complex Fourier series, Mathematics of
Computation 19 (1965) 297–301). We shall briefly introduce their approach in this
section: for a full discussion see E. E. Brigham, The Fast Fourier Transform (Prentice
Hall, Englewood Cliffs, NJ, 1974), whose treatment is similar to that adopted here.

We shall restrict ourselves to the situation where $N = 2^{\gamma}$ for some integer $\gamma$, and,
rather than examine the general case, we shall focus on a particular value of $\gamma$. In
proceeding in this way, the idea should be clear and the extension to other values of
$\gamma$ appear credible. We can summarize the approach as being in three stages:


(a) matrix formulation; 
(b) matrix factorization; and, finally, 
(c) rearranging. 


We first consider a matrix formulation of the DFT. From (8.78), the Fourier trans- 
form sequence 1G, 1, of the sequence 1g, M is generated by 


N-1 . 
6 = У sg, ее (20,1... N- D) (8.91) 
n=0 
We shall consider the particular case when y= 2 (that is, N = 2? = 4), and define 
W= e UN — elt? 
so that (8.91) becomes 
N-1 3 
б,= у в = уы" (#=0,1,2,3) 
п=0 п=0 
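Writing out the terms of the transformed sequence, we have 

$$\begin{aligned}
G_0 &= g_0W^0 + g_1W^0 + g_2W^0 + g_3W^0 \\
G_1 &= g_0W^0 + g_1W^1 + g_2W^2 + g_3W^3 \\
G_2 &= g_0W^0 + g_1W^2 + g_2W^4 + g_3W^6 \\
G_3 &= g_0W^0 + g_1W^3 + g_2W^6 + g_3W^9
\end{aligned}$$

which may be expressed in the vector-matrix form 

$$\begin{bmatrix} G_0 \\ G_1 \\ G_2 \\ G_3 \end{bmatrix} =
\begin{bmatrix} W^0 & W^0 & W^0 & W^0 \\ W^0 & W^1 & W^2 & W^3 \\ W^0 & W^2 & W^4 & W^6 \\ W^0 & W^3 & W^6 & W^9 \end{bmatrix}
\begin{bmatrix} g_0 \\ g_1 \\ g_2 \\ g_3 \end{bmatrix} \tag{8.92}$$

or, more generally, as 

$$\boldsymbol{G}_k = \boldsymbol{W}^{nk}\boldsymbol{g}_n$$

where the vectors $\boldsymbol{G}_k$ and $\boldsymbol{g}_n$ and the square matrix $\boldsymbol{W}^{nk}$ are defined as in (8.92). The 
next step relates to the special properties of the entries in the matrix $\boldsymbol{W}^{nk}$. Note that 
$W^{nk} = W^{nk+pN}$, where $p$ is an integer, and so 

$$W^4 = W^0 = 1, \qquad W^6 = W^2, \qquad W^9 = W^1$$

Thus (8.92) becomes 

$$\begin{bmatrix} G_0 \\ G_1 \\ G_2 \\ G_3 \end{bmatrix} =
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & W^1 & W^2 & W^3 \\ 1 & W^2 & W^0 & W^2 \\ 1 & W^3 & W^2 & W^1 \end{bmatrix}
\begin{bmatrix} g_0 \\ g_1 \\ g_2 \\ g_3 \end{bmatrix} \tag{8.93}$$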


Equation (8.93) is the end of the first stage of the development. In fact, we have so far 
only made use of the properties of the Nth roots of unity. Stage two involves the 
factorization of a matrix, the details of which will be explained later. 
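The matrix formulation of the first stage can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `dft_matrix` and `dft` are hypothetical, and the exponent is reduced modulo $N$ using $W^{nk} = W^{nk+pN}$.

```python
import cmath

def dft_matrix(N):
    """Return the N x N matrix with entries W**(n*k), where W = exp(-2j*pi/N)."""
    W = cmath.exp(-2j * cmath.pi / N)
    # Reduce each exponent modulo N, since W**(n*k) == W**(n*k + p*N).
    return [[W ** ((n * k) % N) for n in range(N)] for k in range(N)]

def dft(g):
    """Direct evaluation of the transform: N*N complex multiplications."""
    N = len(g)
    M = dft_matrix(N)
    return [sum(M[k][n] * g[n] for n in range(N)) for k in range(N)]

# Transform of a four-point sequence by direct matrix multiplication.
G = dft([1, 2, 1, 0])
print([complex(round(z.real, 12), round(z.imag, 12)) for z in G])
```

The direct product is the baseline against which the factorized form developed next can be compared.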


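Note that 

$$\begin{bmatrix} 1 & W^0 & 0 & 0 \\ 1 & W^2 & 0 & 0 \\ 0 & 0 & 1 & W^1 \\ 0 & 0 & 1 & W^3 \end{bmatrix}
\begin{bmatrix} 1 & 0 & W^0 & 0 \\ 0 & 1 & 0 & W^0 \\ 1 & 0 & W^2 & 0 \\ 0 & 1 & 0 & W^2 \end{bmatrix} =
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & W^2 & W^0 & W^2 \\ 1 & W^1 & W^2 & W^3 \\ 1 & W^3 & W^2 & W^1 \end{bmatrix} \tag{8.94}$$

where we have used $W^5 = W^1$ and $W^0 = 1$ (in the top row). The matrix on the right-hand 
side of (8.94) is the coefficient matrix of (8.93), but with rows 2 and 3 interchanged. 
Thus we can write (8.93) as 

$$\begin{bmatrix} G_0 \\ G_2 \\ G_1 \\ G_3 \end{bmatrix} =
\begin{bmatrix} 1 & W^0 & 0 & 0 \\ 1 & W^2 & 0 & 0 \\ 0 & 0 & 1 & W^1 \\ 0 & 0 & 1 & W^3 \end{bmatrix}
\begin{bmatrix} 1 & 0 & W^0 & 0 \\ 0 & 1 & 0 & W^0 \\ 1 & 0 & W^2 & 0 \\ 0 & 1 & 0 & W^2 \end{bmatrix}
\begin{bmatrix} g_0 \\ g_1 \\ g_2 \\ g_3 \end{bmatrix} \tag{8.95}$$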
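The factorization (8.94) can be checked numerically. The sketch below multiplies the two sparse factors and compares the product with the coefficient matrix of (8.93) taken in the row order $k = 0, 2, 1, 3$; the helper `matmul` and the variable names are illustrative only.

```python
import cmath

W = cmath.exp(-2j * cmath.pi / 4)   # W = e^{-j*pi/2} for N = 4

def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# The two factors on the left-hand side of (8.94).
A = [[1, W**0, 0, 0],
     [1, W**2, 0, 0],
     [0, 0, 1, W**1],
     [0, 0, 1, W**3]]
B = [[1, 0, W**0, 0],
     [0, 1, 0, W**0],
     [1, 0, W**2, 0],
     [0, 1, 0, W**2]]

# Coefficient matrix of (8.93) with its rows listed in the order k = 0, 2, 1, 3.
swapped = [[W ** ((n * k) % 4) for n in range(4)] for k in (0, 2, 1, 3)]

product = matmul(A, B)
max_error = max(abs(product[i][j] - swapped[i][j])
                for i in range(4) for j in range(4))
print(max_error)   # rounding error only
```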
We now define a vector $g'$ as 

$$\begin{bmatrix} g'_0 \\ g'_1 \\ g'_2 \\ g'_3 \end{bmatrix} =
\begin{bmatrix} 1 & 0 & W^0 & 0 \\ 0 & 1 & 0 & W^0 \\ 1 & 0 & W^2 & 0 \\ 0 & 1 & 0 & W^2 \end{bmatrix}
\begin{bmatrix} g_0 \\ g_1 \\ g_2 \\ g_3 \end{bmatrix} \tag{8.96}$$

It then follows from (8.96) that 

$$\begin{aligned}
g'_0 &= g_0 + W^0g_2 \\
g'_1 &= g_1 + W^0g_3
\end{aligned}$$

so that $g'_0$ and $g'_1$ are each calculated by one complex multiplication and one addition. Of 
course, in this special case, since $W^0 = 1$, the multiplication is unnecessary, but we are 
attempting to infer the general situation. For this reason, $W^0$ has not been replaced by 1. 
Also, it follows from (8.96) that 

$$\begin{aligned}
g'_2 &= g_0 + W^2g_2 \\
g'_3 &= g_1 + W^2g_3
\end{aligned}$$

and, since $W^2 = -W^0$, the computation of the pair $g'_2$ and $g'_3$ can make use of the 
computations of $W^0g_2$ and $W^0g_3$, with one further addition in each case. Thus the vector $g'$ 
is determined by a total of four complex additions and two complex multiplications. 

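To complete the calculation of the transform, we return to (8.95), and rewrite it in 
the form 

$$\begin{bmatrix} G_0 \\ G_2 \\ G_1 \\ G_3 \end{bmatrix} =
\begin{bmatrix} 1 & W^0 & 0 & 0 \\ 1 & W^2 & 0 & 0 \\ 0 & 0 & 1 & W^1 \\ 0 & 0 & 1 & W^3 \end{bmatrix}
\begin{bmatrix} g'_0 \\ g'_1 \\ g'_2 \\ g'_3 \end{bmatrix} \tag{8.97}$$

It then follows from (8.97) that 

$$\begin{aligned}
G_0 &= g'_0 + W^0g'_1 \\
G_2 &= g'_0 + W^2g'_1
\end{aligned}$$

and we see that $G_0$ is determined by one complex multiplication and one complex addition. 
Furthermore, because $W^2 = -W^0$, $G_2$ follows after one further complex addition. 
Similarly, it follows from (8.97) that 

$$\begin{aligned}
G_1 &= g'_2 + W^1g'_3 \\
G_3 &= g'_2 + W^3g'_3
\end{aligned}$$

and, since $W^3 = -W^1$, a total of one further complex multiplication and two further 
additions are required to produce the re-ordered transform vector 

$$[G_0 \quad G_2 \quad G_1 \quad G_3]^T$$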
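The two stages (8.96) and (8.97) can be written out directly, counting operations as they are performed. This is a minimal sketch for $N = 4$ only; the function name `fft4` and the tallying of operations are illustrative assumptions, and multiplications by $W^0$ are counted (as in the text) even though they are trivial here.

```python
import cmath

def fft4(g):
    """Two-stage four-point FFT following (8.96) and (8.97).

    Returns the transform in the re-ordered form [G0, G2, G1, G3]
    together with the numbers of complex multiplications and additions.
    """
    W = cmath.exp(-2j * cmath.pi / 4)
    mults = 0
    adds = 0

    # Stage 1, equation (8.96): W**2 == -W**0, so each product is used twice.
    p = W**0 * g[2]
    q = W**0 * g[3]
    mults += 2
    gp = [g[0] + p, g[1] + q, g[0] - p, g[1] - q]
    adds += 4

    # Stage 2, equation (8.97): W**2 == -W**0 and W**3 == -W**1.
    r = W**0 * gp[1]
    s = W**1 * gp[3]
    mults += 2
    out = [gp[0] + r, gp[0] - r, gp[2] + s, gp[2] - s]
    adds += 4
    return out, mults, adds

reordered, mults, adds = fft4([1, 2, 1, 0])
print(reordered, mults, adds)
```

Each product is computed once and reused with a sign change, which is where the saving over direct evaluation comes from.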


Thus the total number of operations required to generate the (re-ordered) transform is four 
complex multiplications and eight complex additions. Direct calculation would have 
required $N^2 = 16$ complex multiplications and $N(N-1) = 12$ complex additions. Even 
with a small value of $N$, these savings are significant, and, interpreting computing time 
requirements as being proportional to the number of complex multiplications involved, 
it is easy to see why the FFT algorithm has become an essential tool for computational 
Fourier analysis. When $N = 2^\gamma$, the FFT algorithm is effectively a procedure for producing 
$\gamma$ $N \times N$ matrices of the form (8.94). Extending our ideas, it is possible to see that 
generally the FFT algorithm, when $N = 2^\gamma$, will require $\tfrac{1}{2}N\gamma$ (four, when $N = 2^2 = 4$) 
complex multiplications and $N\gamma$ (eight, when $N = 4$) complex additions. Since 

$$\gamma = \log_2 N$$

the demands of the FFT algorithm in terms of computing time, estimated on the basis 
of the number of complex multiplications, are often given as about $N\log_2 N$, as opposed 
to $N^2$ for the direct evaluation of the transform. This completes the second stage of our 
task, and we are only left with the problem of rearrangement of our transform vector 
into ‘natural’ order. 
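To get a feel for how quickly the saving grows with $N$, the operation counts quoted above can be tabulated. The function name `operation_counts` and the dictionary keys are illustrative; the formulas are those given in the text for $N = 2^\gamma$.

```python
def operation_counts(N):
    """Complex operation counts for a transform of length N = 2**gamma."""
    gamma = N.bit_length() - 1
    if N < 1 or 2 ** gamma != N:
        raise ValueError("N must be a power of 2")
    return {
        "direct_mults": N * N,          # N^2
        "direct_adds": N * (N - 1),     # N(N - 1)
        "fft_mults": N * gamma // 2,    # (1/2) N gamma
        "fft_adds": N * gamma,          # N gamma
    }

for N in (4, 64, 1024):
    print(N, operation_counts(N))
```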

The means by which this is achieved is most elegant. Instead of indexing $G_0$, $G_1$, 
$G_2$, $G_3$ in decimal form, an alternative binary notation is used, and $[G_0 \ G_1 \ G_2 \ G_3]^T$ 
becomes 

$$[G_{00} \quad G_{01} \quad G_{10} \quad G_{11}]^T$$

The process of ‘bit reversal’ means rewriting a binary number with its bits or digits in 
reverse order. Applying this process to $[G_{00} \ G_{01} \ G_{10} \ G_{11}]^T$ yields 

$$[G_{00} \quad G_{10} \quad G_{01} \quad G_{11}]^T = [G_0 \quad G_2 \quad G_1 \quad G_3]^T$$

with decimal labelling. This latter form is exactly the one obtained at the end of the FFT 
calculation, and we see that the natural order can easily be recovered by rearranging the 
output on the basis of bit reversal of the binary indexed version. 
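The bit-reversal rearrangement is easily expressed in code. The sketch below is illustrative (the names `bit_reverse` and `to_natural_order` are not from the text) and works for any number of bits, not only the two-bit case considered here.

```python
def bit_reverse(index, bits):
    """Return the integer whose `bits`-digit binary form is that of `index` reversed."""
    rev = 0
    for _ in range(bits):
        rev = (rev << 1) | (index & 1)   # move the lowest bit of index onto rev
        index >>= 1
    return rev

def to_natural_order(bit_reversed, bits):
    """Rearrange a list produced in bit-reversed order into natural order."""
    natural = [None] * (1 << bits)
    for position, value in enumerate(bit_reversed):
        natural[bit_reverse(position, bits)] = value
    return natural

print(to_natural_order(["G0", "G2", "G1", "G3"], 2))
```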

We have now completed our introduction to the fast Fourier transform. We shall now 
consider an example to illustrate the ideas discussed here. We shall then conclude by 
considering in greater detail the matrix factorization process used in the second stage. 


Example 8.21 Use the method of the FFT algorithm to compute the Fourier transform of the sequence 

$$\{g_k\}_{k=0}^{3} = \{1, 2, 1, 0\}$$

Solution In this case $N = 4 = 2^2$, and we begin by computing the vector $g' = [g'_0 \ g'_1 \ g'_2 \ g'_3]^T$, 
which, from (8.96), is given by 


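$$g' = \begin{bmatrix} 1 & 0 & W^0 & 0 \\ 0 & 1 & 0 & W^0 \\ 1 & 0 & W^2 & 0 \\ 0 & 1 & 0 & W^2 \end{bmatrix}
\begin{bmatrix} g_0 \\ g_1 \\ g_2 \\ g_3 \end{bmatrix}$$

For $N = 4$ 

$$W^n = (e^{-j2\pi/4})^n = e^{-jn\pi/2}$$

and so 

$$g' = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 \end{bmatrix}
\begin{bmatrix} 1 \\ 2 \\ 1 \\ 0 \end{bmatrix} =
\begin{bmatrix} 2 \\ 2 \\ 0 \\ 2 \end{bmatrix}$$

Next, we compute the ‘bit-reversed’ order transform vector $G'$, say, which from (8.97) 
is given by 

$$G' = \begin{bmatrix} 1 & W^0 & 0 & 0 \\ 1 & W^2 & 0 & 0 \\ 0 & 0 & 1 & W^1 \\ 0 & 0 & 1 & W^3 \end{bmatrix}
\begin{bmatrix} g'_0 \\ g'_1 \\ g'_2 \\ g'_3 \end{bmatrix}$$

or, in this particular case, 

$$G' = \begin{bmatrix} G_{00} \\ G_{10} \\ G_{01} \\ G_{11} \end{bmatrix} =
\begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -j \\ 0 & 0 & 1 & j \end{bmatrix}
\begin{bmatrix} 2 \\ 2 \\ 0 \\ 2 \end{bmatrix} =
\begin{bmatrix} 4 \\ 0 \\ -2j \\ 2j \end{bmatrix} \tag{8.98}$$

Finally, we recover the transform vector $G = [G_0 \ G_1 \ G_2 \ G_3]^T$ as 

$$G = \begin{bmatrix} G_{00} \\ G_{01} \\ G_{10} \\ G_{11} \end{bmatrix} =
\begin{bmatrix} 4 \\ -2j \\ 0 \\ 2j \end{bmatrix}$$

and we have thus established the Fourier transform of the sequence $\{1, 2, 1, 0\}$ as the 
sequence 

$$\{4, -2j, 0, 2j\}$$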


It is interesting to compare the labour involved in this calculation with that in 
Example 8.19. 


To conclude this section, we reconsider the matrix factorization operation, which is 
at the core of the process of calculating the fast Fourier transform. In a book of 
this nature it is not appropriate to reproduce a proof of the validity of the algorithm 
for any $N$ of the form $N = 2^\gamma$. Rather, we shall illustrate how the factorization we 
introduced in (8.94) was obtained. The factored form of the matrix will not be generated 
in any calculation: what actually happens is that the various summations are performed 
using their structural properties. 

From (8.91), with $W = e^{-j2\pi/N}$, we wish to calculate the sums 

$$G_k = \sum_{n=0}^{N-1} g_n W^{nk} \quad (k = 0, 1, \ldots, N-1) \tag{8.99}$$

In the case $N = 4$, $\gamma = 2$ we see that $k$ and $n$ take only the values 0, 1, 2 and 3, so we 
can represent both $k$ and $n$ using two-digit binary numbers; in general $\gamma$-digit binary 
numbers will be required. 

We write $k = k_1k_0$ and $n = n_1n_0$, where $k_0$, $k_1$, $n_0$ and $n_1$ may take the values 0 or 1 
only. For example, $k = 3$ becomes $k = 11$ and $n = 2$ becomes $n = 10$. The decimal form 
can always be recovered easily as $k = 2k_1 + k_0$ and $n = 2n_1 + n_0$. 

Using binary notation, we can write (8.99) as 

$$G_{k_1k_0} = \sum_{n_0=0}^{1}\sum_{n_1=0}^{1} g_{n_1n_0} W^{(2n_1+n_0)(2k_1+k_0)} \tag{8.100}$$


The single summation of (8.99) is now replaced, when $\gamma = 2$, by two summations. Again 
we see that for the more general case with $N = 2^\gamma$ a total of $\gamma$ summations replaces the 
single sum of (8.99). 

The matrix factorization operation with which we are concerned is now achieved by 
considering the term 

$$W^{(2n_1+n_0)(2k_1+k_0)}$$

in (8.100). Expanding gives 

$$\begin{aligned}
W^{(2n_1+n_0)(2k_1+k_0)} &= W^{(2k_1+k_0)2n_1}\, W^{(2k_1+k_0)n_0} \\
&= W^{4n_1k_1}\, W^{2n_1k_0}\, W^{(2k_1+k_0)n_0}
\end{aligned} \tag{8.101}$$

Since $W = e^{-j2\pi/N}$ and $N = 4$ in this case, the leading term in (8.101) becomes 

$$W^{4n_1k_1} = (e^{-j2\pi/4})^{4n_1k_1} = (e^{-j2\pi})^{n_1k_1} = 1^{n_1k_1} = 1$$

Again we observe that in the more general case such a factor will always emerge. 
Thus (8.101) can be written as 

$$W^{(2n_1+n_0)(2k_1+k_0)} = W^{2n_1k_0}\, W^{(2k_1+k_0)n_0}$$

so that (8.100) becomes 

$$G_{k_1k_0} = \sum_{n_0=0}^{1}\left[\sum_{n_1=0}^{1} g_{n_1n_0} W^{2n_1k_0}\right] W^{(2k_1+k_0)n_0} \tag{8.102}$$
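which is the required matrix factorization. This can be seen by writing 

$$g'_{k_0n_0} = \sum_{n_1=0}^{1} g_{n_1n_0} W^{2n_1k_0} \tag{8.103}$$

so that the sum in the square brackets in (8.102) defines the four relations 

$$\begin{aligned}
g'_{00} &= g_{00}W^{2\cdot0\cdot0} + g_{10}W^{2\cdot1\cdot0} = g_{00} + g_{10}W^0 \\
g'_{01} &= g_{01}W^{2\cdot0\cdot0} + g_{11}W^{2\cdot1\cdot0} = g_{01} + g_{11}W^0 \\
g'_{10} &= g_{00}W^{2\cdot0\cdot1} + g_{10}W^{2\cdot1\cdot1} = g_{00} + g_{10}W^2 \\
g'_{11} &= g_{01}W^{2\cdot0\cdot1} + g_{11}W^{2\cdot1\cdot1} = g_{01} + g_{11}W^2
\end{aligned} \tag{8.104}$$

which, in matrix form, becomes 

$$\begin{bmatrix} g'_{00} \\ g'_{01} \\ g'_{10} \\ g'_{11} \end{bmatrix} =
\begin{bmatrix} 1 & 0 & W^0 & 0 \\ 0 & 1 & 0 & W^0 \\ 1 & 0 & W^2 & 0 \\ 0 & 1 & 0 & W^2 \end{bmatrix}
\begin{bmatrix} g_{00} \\ g_{01} \\ g_{10} \\ g_{11} \end{bmatrix} \tag{8.105}$$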


and we see that we have re-established the system of equations (8.96), this time with 
binary indexing. Note that in (8.104) and (8.105) we distinguished between terms in $W^0$ 
depending on how the zero is generated. When the zero is generated through the value 
of the summation index (that is, when $n_1 = 0$ and thus a zero will always be generated 
whatever the value of $\gamma$) we replace $W^0$ by 1. When the index is zero because of the 
value of $k_0$, we maintain $W^0$ as an aid to generalization. 

The final stage of the factorization appears when we write the outer summation of 
(8.102) as 

$$G'_{k_0k_1} = \sum_{n_0=0}^{1} g'_{k_0n_0} W^{(2k_1+k_0)n_0} \tag{8.106}$$

which, on writing out in full, gives 

$$\begin{aligned}
G'_{00} &= g'_{00}W^{0\cdot0} + g'_{01}W^{0\cdot1} = g'_{00} + g'_{01}W^0 \\
G'_{01} &= g'_{00}W^{2\cdot0} + g'_{01}W^{2\cdot1} = g'_{00} + g'_{01}W^2 \\
G'_{10} &= g'_{10}W^{1\cdot0} + g'_{11}W^{1\cdot1} = g'_{10} + g'_{11}W^1 \\
G'_{11} &= g'_{10}W^{3\cdot0} + g'_{11}W^{3\cdot1} = g'_{10} + g'_{11}W^3
\end{aligned}$$

or, in matrix form, 

$$\begin{bmatrix} G'_{00} \\ G'_{01} \\ G'_{10} \\ G'_{11} \end{bmatrix} =
\begin{bmatrix} 1 & W^0 & 0 & 0 \\ 1 & W^2 & 0 & 0 \\ 0 & 0 & 1 & W^1 \\ 0 & 0 & 1 & W^3 \end{bmatrix}
\begin{bmatrix} g'_{00} \\ g'_{01} \\ g'_{10} \\ g'_{11} \end{bmatrix} \tag{8.107}$$

The matrix in (8.107) is exactly that of (8.97), and we have completed the factorization 
process as we intended. Finally, to obtain the transform in a natural order, we must 
carry out the bit-reversal operation. From (8.102) and (8.105), we achieve this by simply 
writing 

$$G_{k_1k_0} = G'_{k_0k_1} \tag{8.108}$$


We can therefore summarize the Cooley-Tukey algorithm for the fast Fourier 
transform for the case $N = 4$ by the three relations (8.103), (8.106) and (8.108), that is 

$$\begin{aligned}
g'_{k_0n_0} &= \sum_{n_1=0}^{1} g_{n_1n_0} W^{2n_1k_0} \\
G'_{k_0k_1} &= \sum_{n_0=0}^{1} g'_{k_0n_0} W^{(2k_1+k_0)n_0} \\
G_{k_1k_0} &= G'_{k_0k_1}
\end{aligned}$$

The evaluation of these three relationships is equivalent to the matrix factorization 
process together with the bit-reversal procedure discussed above. 
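The three relations can be evaluated literally, with the binary digits $k_0$, $k_1$, $n_0$ and $n_1$ as loop variables. In this sketch the name `fft4_binary` and the use of dictionaries keyed by digit pairs are illustrative choices; the input list is in natural order, so `g[2*n1 + n0]` plays the role of $g_{n_1n_0}$.

```python
import cmath

def fft4_binary(g):
    """Evaluate (8.103), (8.106) and (8.108) for N = 4."""
    W = cmath.exp(-2j * cmath.pi / 4)
    bits = (0, 1)

    # (8.103): g'_{k0 n0} = sum over n1 of g_{n1 n0} W^(2 n1 k0)
    g1 = {(k0, n0): sum(g[2 * n1 + n0] * W ** (2 * n1 * k0) for n1 in bits)
          for k0 in bits for n0 in bits}

    # (8.106): G'_{k0 k1} = sum over n0 of g'_{k0 n0} W^((2 k1 + k0) n0)
    G1 = {(k0, k1): sum(g1[(k0, n0)] * W ** ((2 * k1 + k0) * n0) for n0 in bits)
          for k0 in bits for k1 in bits}

    # (8.108): G_{k1 k0} = G'_{k0 k1}, stored at the natural index k = 2 k1 + k0
    return [G1[(k % 2, k // 2)] for k in range(4)]

print(fft4_binary([1, 2, 1, 0]))
```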

The fast Fourier transform is essentially a computer-orientated algorithm and highly 
efficient codes are available in MATLAB and other software libraries, usually requiring 
a simple subroutine call for their implementation. The interested reader who would 
prefer to produce ‘home-made’ code may find listings in the textbook by Brigham 
quoted at the beginning of this section, as well as elsewhere. 
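For readers who wish to experiment before consulting such listings, a short ‘home-made’ routine is sketched below. It should be read with some care: it is a recursive radix-2 arrangement that splits the sequence into even- and odd-indexed halves, which delivers the output directly in natural order, rather than the matrix-factorization-plus-bit-reversal arrangement developed in this section. The name `fft_radix2` is illustrative.

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 FFT of a sequence whose length is a power of 2."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    if N == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])   # transform of g_0, g_2, g_4, ...
    odd = fft_radix2(x[1::2])    # transform of g_1, g_3, g_5, ...
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + t
        out[k + N // 2] = even[k] - t   # uses W**(k + N/2) == -W**k
    return out

print(fft_radix2([1, 2, 1, 0]))
```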
8.6.6 Exercises 

28 Calculate directly the discrete Fourier transform of the sequence 

$$\{1, 0, 1, 0\}$$

using the methods of Section 8.6.3 (see Example 8.19). 

29 Use the fast Fourier transform method to calculate the transform of the sequence of 
Exercise 28 (follow Example 8.21). 

30 Use the FFT algorithm in MATLAB (or an alternative) to improve the experiment with 
the estimation of the spectrum of the signal of Example 8.20. 

31 Derive an FFT algorithm for $N = 2^3 = 8$ points. Work from (8.99), writing 

$$k = 4k_2 + 2k_1 + k_0, \quad k_i = 0 \text{ or } 1 \text{ for all } i$$

$$n = 4n_2 + 2n_1 + n_0, \quad n_i = 0 \text{ or } 1 \text{ for all } i$$

to show that 

$$\begin{aligned}
g'_{k_0n_1n_0} &= \sum_{n_2=0}^{1} g_{n_2n_1n_0} W^{4k_0n_2} \\
g''_{k_0k_1n_0} &= \sum_{n_1=0}^{1} g'_{k_0n_1n_0} W^{(2k_1+k_0)2n_1} \\
G'_{k_0k_1k_2} &= \sum_{n_0=0}^{1} g''_{k_0k_1n_0} W^{(4k_2+2k_1+k_0)n_0} \\
G_{k_2k_1k_0} &= G'_{k_0k_1k_2}
\end{aligned}$$
Figure 8.31 Amplitude response of an ideal low-pass filter (frequency axis $\omega$, marked at $-\omega_c$, 0 and $\omega_c$). 


In this section we explore the ideas of mathematical design or synthesis. We shall 
express in mathematical form the desired performance of a system, and, utilizing the 
ideas we have developed, produce a system design. 

This chapter has been concerned with the frequency-domain representation of 
signals and systems, and the system we shall design will operate on input signals 
to produce output signals with specific frequency-domain properties. In Figure 8.31 we 
illustrate the amplitude response of an ideal low-pass filter. This filter passes perfectly 
signals, or components of signals, at frequencies less than the cut-off frequency $\omega_c$. 
Above $\omega_c$, attenuation is perfect, meaning that signals above this frequency are not 
passed by this filter. 

The amplitude response of this ideal device is given by 

$$|H(j\omega)| = \begin{cases} 1 & (|\omega| \le \omega_c) \\ 0 & (|\omega| > \omega_c) \end{cases}$$

Such an ideal response cannot be attained by a real analogue device, and our design 
problem is to approximate this response to an acceptable degree using a system that 
can be constructed. A class of functions whose graphs resemble that of Figure 8.31 is 
the set 

$$|H_B(j\omega)| = \frac{1}{\sqrt{1 + (\omega/\omega_c)^{2n}}}$$

and we see from Figure 8.32, which corresponds to $\omega_c = 1$, that, as $n$ increases, the 
graph approaches the ideal response. This particular approximation is known as the 
Butterworth approximation, and is only one of a number of possibilities. 
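The way in which the Butterworth amplitude response approaches the ideal response as $n$ increases can be explored numerically. The function name `butterworth_gain` is illustrative.

```python
import math

def butterworth_gain(w, wc=1.0, n=1):
    """Amplitude response 1 / sqrt(1 + (w/wc)**(2n)) of the order-n approximation."""
    return 1.0 / math.sqrt(1.0 + (w / wc) ** (2 * n))

# Gain in the pass band (w = 0.5), at cut-off (w = 1) and in the stop band (w = 2).
for n in (1, 2, 3, 5):
    print(n, [round(butterworth_gain(w, 1.0, n), 4) for w in (0.5, 1.0, 2.0)])
```

Whatever the order, the gain at the cut-off frequency is $1/\sqrt{2}$; increasing $n$ pushes the pass-band gain towards 1 and the stop-band gain towards 0.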


Figure 8.32 
Amplitude responses 
of the Butterworth 
filters. 
To explore this approach further, we must ask the question whether such a response 
could be obtained as the frequency response of a realizable, stable linear system. We 
assume that it can, although if our investigation leads to the opposite conclusion then 
we shall have to abandon this approach and seek another. If $H_B(j\omega)$ is the frequency 
response of such a system then it will have been obtained by replacing $s$ with $j\omega$ in the 
system Laplace transfer function. This is at least possible since, by assumption, we are 
dealing with a stable system. Now 

$$|H_B(j\omega)|^2 = \frac{1}{1 + (j\omega/j\omega_c)^{2n}}$$

where $|H_B(j\omega)|^2 = H_B(j\omega)H_B^*(j\omega)$. If $H_B(s)$ is to have real coefficients, and thus be 
realizable, then we must have $H_B^*(j\omega) = H_B(-j\omega)$. Thus 

$$H_B(j\omega)H_B(-j\omega) = \frac{1}{1 + (\omega/\omega_c)^{2n}} = \frac{1}{1 + (j\omega/j\omega_c)^{2n}}$$

and we see that the response could be obtained by setting $s = j\omega$ in 

$$H_B(s)H_B(-s) = \frac{1}{1 + (s/j\omega_c)^{2n}}$$

Our task is now to attempt to separate $H_B(s)$ from $H_B(-s)$ in such a way that $H_B(s)$ 
represents the transfer function of a stable system. To do this, we solve the equation 

$$1 + (s/j\omega_c)^{2n} = 0$$

to give the poles of $H_B(s)H_B(-s)$ as 

$$s = \omega_c e^{j[(2k+1)\pi/2n + \pi/2]} \quad (k = 0, 1, 2, 3, \ldots) \tag{8.109}$$


Figure 8.33 shows the pole locations for the cases $n = 1$, 2, 3 and 5. The important 
observations that we can make from this figure are that in each case there are $2n$ poles 
equally spaced around the circle of radius $\omega_c$ in the Argand diagram, and that there are 
no poles on the imaginary axis. If $s = s_1$ is a pole of $H_B(s)H_B(-s)$ then so is $s = -s_1$, and 
we can thus select as poles for the transfer function $H_B(s)$ those lying in the left half-plane. 
The remaining poles are then those of $H_B(-s)$. By this procedure, we have 
generated a stable transfer function $H_B(s)$ for our filter design. 
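The pole-selection procedure can be sketched as follows. The helper names (`butterworth_poles`, `stable_poles`, `denominator`) are illustrative; the poles are generated from (8.109) with $k = 0, 1, \ldots, 2n-1$, and those with negative real part are retained.

```python
import cmath
import math

def butterworth_poles(n, wc=1.0):
    """All 2n poles of H_B(s)H_B(-s) given by (8.109)."""
    return [wc * cmath.exp(1j * ((2 * k + 1) * math.pi / (2 * n) + math.pi / 2))
            for k in range(2 * n)]

def stable_poles(n, wc=1.0):
    """The n poles lying in the left half-plane."""
    return [s for s in butterworth_poles(n, wc) if s.real < 0]

def denominator(poles):
    """Coefficients, highest power first, of the product of (s - s_i)."""
    coeffs = [1 + 0j]
    for p in poles:
        # Multiply the current polynomial by (s - p).
        coeffs = [a - p * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

# Second-order filter with wc = 4: coefficients of the denominator polynomial.
print([round(c.real, 6) for c in denominator(stable_poles(2, 4.0))])
```

For $n = 2$ the computed coefficients can be compared with those of the second-order transfer function quoted in the text that follows.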

Figure 8.33 Pole locations for the Butterworth filters of orders $n = 1$, 2, 3 and 5. 

The transfer function that we have generated from the frequency-domain specification 
of system behaviour must now be related to a real system, and this is the next step 
in the design process. The form of the transfer function for the filter of order $n$ can be 
shown to be 

$$H_B(s) = \frac{\omega_c^n}{(s - s_1)(s - s_2)\cdots(s - s_n)}$$

where $s_1, s_2, \ldots, s_n$ are the stable poles generated by (8.109). The reader is invited to 
show that the second-order Butterworth filter has transfer function 

$$H_B(s) = \frac{\omega_c^2}{s^2 + \sqrt{2}\,\omega_c s + \omega_c^2}$$

Writing $Y(s) = H_B(s)U(s)$, with $H_B(s)$ as above, we obtain 

$$Y(s) = \frac{\omega_c^2}{s^2 + \sqrt{2}\,\omega_c s + \omega_c^2}\,U(s)$$

or 

$$(s^2 + \sqrt{2}\,\omega_c s + \omega_c^2)Y(s) = \omega_c^2 U(s) \tag{8.110}$$

If we assume that all initial conditions are zero then (8.110) represents the Laplace 
transform of the differential equation 

$$\frac{d^2y(t)}{dt^2} + \sqrt{2}\,\omega_c\frac{dy(t)}{dt} + \omega_c^2 y(t) = \omega_c^2 u(t) \tag{8.111}$$


This step completes the mathematical aspect of the design exercise. It is possible to 
show that a system whose behaviour is modelled by this differential equation can be 
constructed using elementary circuit components, and the specification of such a circuit 
would complete the design. For a fuller treatment of the subject the interested reader 
could consult M. J. Chapman, D. P. Goodall and N. C. Steele, Signal Processing in 
Electronic Communications, Horwood Publishing, Chichester, 1997. 

To appreciate the operation of this filter, the use of the Signal Processing Toolbox in 
MATLAB is recommended. After setting the cut-off frequency $\omega_c$, at 4 for example, the 
output of the system $y(t)$ corresponding to an input signal $u(t) = \sin t + \sin 10t$ will 
demonstrate the almost-perfect transmission of the low-frequency ($\omega = 1$) term, with nearly total 
attenuation of the high-frequency ($\omega = 10$) signal. As an extension to this exercise, the 
differential equations representing the third- and fourth-order filters should be obtained, 
and the responses compared. Using a simulation package and an FFT coding, it is possible 
to investigate the operation of such devices from the viewpoint of the frequency domain 
by examining the spectrum of samples drawn from both input and output signals. 
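Before running a simulation, the steady-state gains at the two input frequencies can be estimated directly from the second-order transfer function by setting $s = j\omega$. This is a minimal sketch, assuming $\omega_c = 4$ as suggested above; the function name `second_order_response` is illustrative.

```python
import math

def second_order_response(w, wc):
    """Frequency response of wc^2 / (s^2 + sqrt(2) wc s + wc^2) at s = jw."""
    s = 1j * w
    return wc ** 2 / (s ** 2 + math.sqrt(2) * wc * s + wc ** 2)

wc = 4.0
gains = {w: abs(second_order_response(w, wc)) for w in (1.0, 10.0)}
for w, gain in gains.items():
    # Compare with the Butterworth amplitude formula for n = 2.
    print(w, gain, 1.0 / math.sqrt(1.0 + (w / wc) ** 4))
```

The two printed columns agree, confirming that the transfer function reproduces the Butterworth amplitude response of order 2.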
8.8 Engineering application: modulation, demodulation and frequency-domain filtering 

8.8.1 Introduction 

In this section we demonstrate the practical implementation of modulation, demodulation 
and frequency-domain filtering. These are the processes by which an information-carrying 
signal can be combined with others for transmission along a channel, with 
the signal subsequently being recovered so that the transmitted information can be 
extracted. When a number of signals have to be transmitted along a single channel at 
the same time, one solution is to use the method of amplitude modulation as described 
in Section 8.3.4. We assume that the channel is ‘noisy’, so that the received signal 
contains noise, and this signal is then cleaned and demodulated using frequency-domain 
filtering techniques. This idea is easy to describe and to implement, but cannot 
usually be performed on-line in view of the heavy computational requirements. Our 
filtering operations are carried out on the frequency-domain version of the signal, and 
this is generated using the fast Fourier transform algorithm. The MATLAB code in 
Figure 8.34 is designed to illustrate how results can be obtained working from basic 
ideas. The nature and usefulness of the Toolboxes now associated with MATLAB have 
made it possible to work at a higher level. Nevertheless it is thought valuable to retain 
this figure for instructional purposes, since it is easily modified. (Note: In Figure 8.34 i 
is used instead of j to denote $\sqrt{-1}$, to conform with MATLAB convention.) 

Figure 8.34 ‘MATLAB’ M-file demonstrating frequency-domain filtering using the fast Fourier transform. 


% Demonstration of frequency domain filtering using the FFT. 
% 
% Some MATLAB housekeeping to prevent memory problems! 
clear 
clf 
% 
% Select a value of N for the number of samples to be taken. 
% Make a selection by adding or removing % symbols. 
% N must be a power of 2. 
%N = 512; 
N = 1024; 
%N = 2048; 
%N = 4096; 
%N = 8192; 
% 
% T is the sampling interval and the choice of N determines the 
% interval over which the signal is processed. Also, if 
% N frequency domain values are to be produced the resolution 
% is determined. 
T = 0.001; 
t = 0:T:(N - 1)*T; 
delw = 2*pi/(T*N); 
% 
% Generate the 'information' 
f = t.*exp(-t/2); 

% 
% Set the frequency of the carriers, wc is the carrier which 
% will be modulated. 
wc = 2*pi*50; 
wca = 2*pi*120; 
% 
% Perform the modulation ... 
x = f.*cos(wc*t) + cos(wca*t); 
% 
% ...and add channel noise here (randn gives normally distributed samples) 
nfac = 0.2; 
x = x + nfac*randn(size(t)); 
% 
% Plot the 'received' time signal 
plot(t,x) 
title('The time signal, modulated carrier and noise if added') 
xlabel('time, t') 
ylabel('x(t)') 
pause 
% 
% Calculate the DFT using the FFT algorithm ... 
y = fft(x); 
z = T*abs(y); 
w = 0:delw:(N - 1)*delw; 
% 
% ... and plot the amplitude spectrum. 
plot(w,z) 
title('The amplitude spectrum. Spikes at frequencies of carriers') 
xlabel('frequency, w') 
ylabel('amplitude') 
pause 
% 
% Construct a filter to isolate the information-bearing carrier. 
% 
% 2*hwind + 1 is the length of the filter 'window'. 
% Set ffac to a value less than 1.0. ffac = 0.5 gives a filter 
% of half length wc/2 where wc is frequency of carrier. Don't 
% exceed a value of 0.95! 
ffac = 0.5; 
hwind = round(ffac*wc/delw); 
l = 2*hwind + 1; 
% 
% Set the centre of the window at peak corresponding to wc. 
% Check this is ok by setting l = 1! 
l1 = round(wc/delw) - hwind; 
% 
% Remember that we must have both ends of the filter! 
mask = [zeros(1,l1),ones(1,l),zeros(1,N - (2*l + 2*l1 - 1)),ones(1,l),zeros(1,l1 - 1)]; 
% 
% Do the frequency domain filtering... 
zz = mask.*y; 
% 
% ... and calculate the inverse DFT 
yya = ifft(zz); 
% 
% Remove rounding errors ... it is real! 
yy = 0.5*(yya + conj(yya)); 
% 
% Plot the 'cleaned' spectrum with only lower carrier present. 
plot(w,T*abs(zz)) 
title('Upper carrier eliminated and noise reduced') 
xlabel('frequency, w') 
ylabel('amplitude') 
pause 
% 
% Now the signal is cleaned but needs demodulating so 
% form the product with 2*carrier signal ... 
dem = yy.*cos(wc*t); 
dem = 2*dem; 
% 
% ... and take the DFT. 
demft = fft(dem); 
% 
% Use a low-pass filter on the result, the length is llp. 
% The same factor is used as before! 
llp = round(ffac*wc/delw); 
masklp = [ones(1,llp),zeros(1,N - (2*llp - 1)),ones(1,llp - 1)]; 
% 
% Carry out the filtering... 
op = masklp.*demft; 
% 
% ... and plot the DFT of filtered signal. 
plot(w,T*abs(op)) 
title('Result of demodulation and low-pass filtering') 
xlabel('frequency, w') 
ylabel('amplitude') 
pause 
% 
% Return to the time domain... 
opta = ifft(op); 
opt = 0.5*(opta + conj(opta)); 
act = f; 
vp = N; 
% ... and finally plot the extracted signal vs the original. 
plot(t(1:vp),opt(1:vp),'-',t(1:vp),act(1:vp),':'); 
title('The extracted signal, with original') 
xlabel('time, t') 
ylabel('f(t)') 
pause 
% 
% Clean-up ... 
clf 
clear 
% 
% ... but responsibly! 
i = sqrt(-1); 
home 


8.8.2 Modulation and transmission 


We suppose that our ‘information’ consists of samples from the signal $f(t) = te^{-t/2}$, 
taken at intervals $T = 0.001$ s. This signal, or more correctly, data sequence, will be used 
to modulate the carrier signal $\cos(50 \times 2\pi t)$. A second carrier signal is given by 
$\cos(120 \times 2\pi t)$, and this can be thought of as carrying the signal $f(t) = 1$. We 
combine these two signals and add ‘white noise’ to represent the action of the channel. 
This part of the exercise corresponds to the signal generation and transmission part of 
the overall process, and Figure 8.35 shows the time-domain version of the resulting 
signal. 

Figure 8.35 Time-domain version of noisy signal (horizontal axis: time, $t$; vertical axis: $x(t)$). 


8.8.3 Identification and isolation of the information-carrying signal 


Here we begin the signal-processing operations. The key to this is Fourier analysis, and 
we make use of the fast Fourier transform algorithm to perform the necessary trans- 
forms and their inverses. First we examine the spectrum of the received signal, shown 
in Figure 8.36. We immediately see two spikes corresponding to the carrier signals, and 
we know that the lower one is carrying the signal we wish to extract. We must design 
a suitable filter to operate in the frequency domain for the isolation of the selected 
carrier wave before using the demodulation operation to extract the information. To do 
this, we simply mask the transformed signal, multiplying by 1 those components we 
wish to pass, and by 0 those we wish to reject. Obviously we want to pass the carrier- 
wave frequency component itself, but we must remember that the spectrum of the informa- 
tion signal is centred on this frequency, and so we must pass a band of frequencies 
around this centre frequency. Again a frequency-domain filter is constructed. We thus 
have to construct a bandpass filter of suitable bandwidth to achieve this, and moreover, 
we must remember to include the right-hand half of the filter! There are no problems 
here with the Nyquist frequency: at first glance we simply need to avoid picking up 
the second carrier wave. However, the larger the bandwidth we select, the more noise 
we shall pass, and so a compromise has to be found between the necessary width for 
good signal recovery and noise elimination. Obviously, since we know the bandwidth 
of our information signal in this case, we could make our choice based on this knowledge. 
This, however, would be cheating, because usually the exact nature of the transmitted 
information is not known in advance: if it were, there would be little point in 
sending it! In the M-file we have set the half-length of the filter to be a fraction of the 
carrier frequency. The carrier frequency $\omega_c$ represents the maximum possible channel 
bandwidth, and in practice a channel would have a specified maximum bandwidth 
associated with it. Figure 8.37 shows the resulting spectrum after application of the 
bandpass filter, with a bandwidth less than $\omega_c$. 

Figure 8.36 Spectrum of received signal. Spikes at frequencies of carriers. 


8.8.4 Demodulation stage 


The purpose of this operation is to extract the information from the carrier wave, and 
it can be shown that multiplying the time signal by $\cos\omega_c t$, where $\omega_c$ is the frequency 
of the carrier wave, has the effect of shifting the spectrum of the modulating signal so 
that it is again centred on the origin. To perform the multiplication operation, we have 
to return to the time domain, and this is achieved by using the inverse FFT algorithm. In 
the frequency-domain representation of the demodulated signal there are also copies of 
the spectrum of the modulating signal present, centred at higher frequencies ($2\omega_c$, $4\omega_c$), 
and so we must perform a final low-pass filtering operation on the demodulated signal. 
To do this, we return to the frequency domain using the FFT algorithm again. The 
result of the demodulation and low-pass filtering operations is shown in Figure 8.38. 
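The effect of the demodulating multiplication can be seen from the identity $2\cos^2\omega_c t = 1 + \cos 2\omega_c t$. The sketch below checks it sample by sample for the noise-free modulated carrier, using the sampling interval and carrier frequency of the M-file; the function names are illustrative and no filtering is attempted here.

```python
import math

T = 0.001                  # sampling interval used in the M-file
wc = 2 * math.pi * 50      # frequency of the information-bearing carrier

def info(t):
    """The 'information' signal f(t) = t exp(-t/2)."""
    return t * math.exp(-t / 2)

def demodulated(k):
    """Sample k of the modulated carrier multiplied by 2 cos(wc t)."""
    t = k * T
    return 2.0 * (info(t) * math.cos(wc * t)) * math.cos(wc * t)

def baseband_plus_image(k):
    """f(t) plus a copy of f(t) carried at twice the carrier frequency."""
    t = k * T
    return info(t) * (1.0 + math.cos(2 * wc * t))

max_diff = max(abs(demodulated(k) - baseband_plus_image(k)) for k in range(1024))
print(max_diff)   # rounding error only
```

The first term is the information signal itself, centred on the origin in the frequency domain; the second is the higher-frequency copy that the final low-pass filter removes.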
8.8.5 Final signal recovery 

The last operation to be performed is to return to the time domain to examine what we 
have achieved. After calling the inverse FFT routine, the extracted signal is plotted 
together with the original for comparison. The results with a fairly low value for the 
added noise are shown in Figure 8.39. If the process is carried out in the absence of 
noise altogether, excellent signal recovery is achieved, except for the characteristic 
‘ringing’ due to the sharp edges of the filters. 
8.8.6 Further developments 


Readers are invited to develop this case study to increase their understanding. Try adding 
a second information signal modulating the second carrier wave, and extract both signals 
after ‘transmission’. Also add more carrier waves and modulating signals, and investigate 
signal recovery. If information signal bandwidths are limited to a fixed value, how 
many signals can be transmitted and recovered satisfactorily? What happens if T is 
altered? Can the ‘ringing’ effect be reduced by smoothing the transition from the string 
of ones to the string of zeros in the filter masks? Seek references to various window 
functions in signal-processing texts to assist in resolving this question. 
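As a starting point for such experiments, a filter mask with optional smoothed edges might be built as follows. This is only a sketch: the function `bandpass_mask` and its `taper` parameter are hypothetical, and the raised-cosine roll-off is one illustrative choice among the many window shapes discussed in signal-processing texts. With `taper=0` the mask has the same two blocks of ones as the mask in Figure 8.34, centred on the carrier bin and on its mirror image.

```python
import math

def bandpass_mask(N, centre, half_width, taper=0):
    """Mask of length N passing DFT bins within half_width of +/- centre.

    taper > 0 adds a raised-cosine roll-off of that many bins on each edge.
    """
    mask = [0.0] * N
    for k in range(N):
        # Distance from the carrier bin, measured either on the lower half
        # of the DFT or on its mirror image in the upper half.
        d = min(abs(k - centre), abs((N - k) - centre))
        if d <= half_width:
            mask[k] = 1.0
        elif d <= half_width + taper:
            mask[k] = 0.5 * (1.0 + math.cos(math.pi * (d - half_width) / (taper + 1)))
    return mask

sharp = bandpass_mask(64, 10, 3)
smooth = bandpass_mask(64, 10, 3, taper=4)
print(sharp[5:16])
print([round(v, 3) for v in smooth[5:16]])
```

Because the mask takes equal values at bins $k$ and $N - k$, applying it to the transform of a real signal leaves a transform whose inverse is again real.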


8.9 Engineering application: direct design of digital filters and windows 


This application section provides a brief introduction to some methods of digital filter 
design. In particular we introduce a transform based on the Fourier transform itself, rather 
than going via the exponential form of Fourier series and the underlying periodicity 
implications. The material contained in this section first appeared in Signal Processing 
in Electronic Communication by M. J. Chapman, D. P. Goodall and N. C. Steele, originally 
published in the Horwood Series in Engineering Science in 1997 and is reproduced by 
courtesy of the current publishers Woodhead Publishing Limited. 


8.9.1 Digital filters 


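Suppose $f(t)$ is a signal with Fourier transform $F(j\omega)$ so that 

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(j\omega)e^{j\omega t}\,d\omega \tag{8.112}$$

If we now sample $f(t)$ at times $t = kT$, $k \in \mathbb{Z}$, we obtain the sequence $\{f_k\} = \{f(kT)\}$ and 
(8.112) gives 

$$f_k = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(j\omega)e^{j\omega kT}\,d\omega \tag{8.113}$$

Splitting this infinite interval of integration into intervals of length $2\pi/T$, we obtain 

$$\begin{aligned}
f_k &= \frac{1}{2\pi}\sum_{n=-\infty}^{\infty}\int_{2n\pi/T}^{2(n+1)\pi/T} F(j\omega)e^{j\omega kT}\,d\omega \\
&= \frac{1}{2\pi}\int_{0}^{2\pi/T}\sum_{n=-\infty}^{\infty} F\!\left(j\left(\omega + \frac{2\pi n}{T}\right)\right)e^{j\omega kT}\,d\omega
\end{aligned}$$

since $e^{j(\omega + 2\pi n/T)kT} = e^{j\omega kT}e^{j2\pi nk} = e^{j\omega kT}$. As usual, we do not attempt to give conditions 
under which the above interchange between an integral and an infinite sum is valid. In 
any case, the above is only intended as a formal procedure leading to a definition for 
the discrete-time Fourier transform. 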


If we now let $\theta$ be the normalized frequency $\theta = \omega T$ and set 

$$F(e^{j\theta}) = \frac{1}{T}\sum_{n=-\infty}^{\infty} F\!\left(j\,\frac{\theta + 2\pi n}{T}\right) \tag{8.114}$$

where we note that the right-hand side has period $2\pi$ in $\theta$, we obtain 

$$f_k = \frac{1}{2\pi}\int_{0}^{2\pi} F(e^{j\theta})e^{jk\theta}\,d\theta \tag{8.115}$$

The periodic function, $F(e^{j\theta})$, is referred to as the discrete-time Fourier transform 
(DTFT) of the sequence $\{f_k\}$. Equation (8.114) is unsuitable for calculation of the 
DTFT and so we instead use (8.115) to define the inverse DTFT and invert this in order 
to define the direct transform. We claim that the DTFT is, in fact, given by 

$$F(e^{j\theta}) = \sum_{n=-\infty}^{\infty} f_n e^{-jn\theta} = \sum_{n=-\infty}^{\infty} f_n (e^{j\theta})^{-n} \tag{8.116}$$

Note that this is the same as the transform defined in (6.1), which is known as the 
bilateral z transform of $\{f_k\}$ reflecting the fact that it is defined for both positive and 
negative values of the time index $k$, evaluated at $z = e^{j\theta}$. This fact also explains the use 
of the notation, $F(e^{j\theta})$. To show that (8.116) is valid, we substitute into the right-hand 
side of (8.115) to give 

$$\begin{aligned}
\frac{1}{2\pi}\int_{0}^{2\pi} F(e^{j\theta})e^{jk\theta}\,d\theta
&= \frac{1}{2\pi}\int_{0}^{2\pi}\sum_{n=-\infty}^{\infty} f_n e^{-jn\theta}e^{jk\theta}\,d\theta \\
&= \frac{1}{2\pi}\sum_{n=-\infty}^{\infty} f_n \int_{0}^{2\pi} e^{j(k-n)\theta}\,d\theta
\end{aligned}$$

assuming the interchange of summation and integration is permissible. However, it is 
easy to see that 

$$\frac{1}{2\pi}\int_{0}^{2\pi} e^{j(k-n)\theta}\,d\theta = \begin{cases} 0, & \text{for } k \ne n \\ 1, & \text{for } k = n \end{cases} \;= \delta_{kn}$$

and so the right-hand side of (8.115) reduces to 

$$\sum_{n=-\infty}^{\infty} f_n\delta_{kn} = f_k$$

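as desired. To summarize, we have the two equations 

$$\text{DTFT} \qquad F(e^{j\theta}) = \sum_{k=-\infty}^{\infty} f_k e^{-jk\theta} \tag{8.117a}$$

$$\text{IDTFT} \qquad f_k = \frac{1}{2\pi}\int_{0}^{2\pi} F(e^{j\theta})e^{jk\theta}\,d\theta \tag{8.117b}$$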

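Example 8.22 Calculate the discrete-time Fourier transform of the finite sequence 

$$\{u_k\} = \{1, 2, 2, 1\}$$

Solution We adopt the convention that the above sequence is ‘padded-out’ with zeros, that 
is we have $\{u_k\} = \{u_k\}_{k=-\infty}^{\infty}$ where $u_0 = u_3 = 1$, $u_1 = u_2 = 2$ and $u_k = 0$ otherwise. It follows 
from (8.117a) that 

$$\begin{aligned}
U(e^{j\theta}) &= 1 + 2e^{-j\theta} + 2e^{-2j\theta} + e^{-3j\theta} \\
&= e^{-j3\theta/2}\left(e^{j3\theta/2} + 2e^{j\theta/2} + 2e^{-j\theta/2} + e^{-j3\theta/2}\right) \\
&= e^{-j3\theta/2}\left[2\cos\left(\tfrac{3}{2}\theta\right) + 4\cos\left(\tfrac{1}{2}\theta\right)\right]
\end{aligned}$$

A sketch of $|U(e^{j\theta})| = \left|2\cos\left(\tfrac{3}{2}\theta\right) + 4\cos\left(\tfrac{1}{2}\theta\right)\right|$ is given in Figure 8.40. $|U(e^{j\theta})|$ is called 
the amplitude spectrum of the sequence $\{u_k\}$. Figure 8.40 clearly shows the periodicity 
of $|U(e^{j\theta})|$, which by now is not surprising. In a similar fashion, we refer to $\arg U(e^{j\theta})$ 
as the phase spectrum of $\{u_k\}$. 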
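The result of Example 8.22 can be checked numerically. In the sketch below the function names are illustrative, and the inverse transform (8.117b) is approximated by a simple rectangle rule with an assumed number of points `M`; for a finite sequence such as this one the rule recovers the samples to within rounding error.

```python
import cmath
import math

def dtft(seq, theta, start=0):
    """Evaluate (8.117a) for a finite sequence whose first term has index `start`."""
    return sum(f * cmath.exp(-1j * (start + k) * theta) for k, f in enumerate(seq))

def amplitude_formula(theta):
    """Closed form |2 cos(3 theta/2) + 4 cos(theta/2)| from Example 8.22."""
    return abs(2 * math.cos(1.5 * theta) + 4 * math.cos(0.5 * theta))

def idtft(F, k, M=64):
    """Approximate (8.117b) by an M-point rectangle rule on [0, 2 pi)."""
    h = 2 * math.pi / M
    return sum(F(m * h) * cmath.exp(1j * k * m * h) for m in range(M)) / M

u = [1, 2, 2, 1]
for theta in (0.0, 0.7, 2.0, math.pi):
    print(theta, abs(dtft(u, theta)), amplitude_formula(theta))
print([idtft(lambda th: dtft(u, th), k) for k in range(4)])
```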
Figure 8.40 Amplitude spectrum $|U(e^{j\theta})|$ for Example 8.22 (horizontal axis: $\theta$). 





We are now in a position to develop a direct approach to the design of digital filters 
based on a Fourier series approach. Suppose that D(z) is the transfer function of a stable 
discrete-time system, then, we can write as usual, 


$$Y(z) = D(z)U(z)$$

If the input sequence is $\{u_k\} = \{\delta_k\} = \{1, 0, 0, 0, \ldots\}$, the unit impulse sequence with
z-transform $U(z) = 1$, then the transform of the output sequence, namely the impulse
response sequence, is

$$Y_\delta(z) = D(z) = \sum_{n=0}^{\infty} d_n z^{-n}$$

Since the system is stable, by assumption, there is a frequency response which is
obtained by taking the DTFT of the impulse response sequence. This is achieved by
replacing $z$ by $e^{j\omega T}$ in $D(z)$ to obtain

$$D(e^{j\omega T}) = D(e^{j\theta}) = \sum_{n=0}^{\infty} d_n e^{-jn\theta} \tag{8.118}$$

where $\theta = \omega T$.
Now (8.118) can be interpreted as the Fourier expansion of $D(e^{j\theta})$, using as basis
functions the orthogonal set $\{e^{-jn\theta}\}$. It is then easy to show that the Fourier coefficients
relative to this base are given by

$$d_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} D(e^{j\theta})\,e^{jn\theta}\,d\theta$$

We now set $D(e^{j\theta})$ to the desired ideal frequency response function and calculate the
resulting Fourier coefficients, $\{h_d(n)\}$ say. It should be noted that, at this stage, we can
no longer restrict to $n \geq 0$, that is, $h_d(n)$, as defined above, is not causal and hence does not
correspond with the impulse response of any realizable system. If a filter is to be realized
using a finite number of delay elements, some form of truncation must take place.
It is helpful to think of this truncation being performed by the application of a window,
defined by a window weighting function $w(n)$. The simplest window is the rectangular
window, with weighting function $w(n)$ defined by

$$w(n) = \begin{cases} 1, & -n_1 \leq n \leq n_2 \\ 0, & \text{otherwise} \end{cases}$$

Using this window, we actually form

$$\sum_{n=-\infty}^{\infty} w(n)h_d(n)e^{-jn\theta} = \sum_{n=-n_1}^{n_2} h_d(n)e^{-jn\theta} = \tilde{D}(e^{j\theta})$$
where, if $n_1$ and $n_2$ are sufficiently large, $\tilde{D}(e^{j\theta})$ will be an adequate approximation to
$D(e^{j\theta})$, the desired frequency response. It is important to note that the filter length,
that is the number of delay elements or terms in the difference equation, depends on the
choice of $n_1$ and $n_2$. This means that some accuracy will always have to be sacrificed in
order to produce an acceptable design.

We explore this technique by designing a low-pass filter in Example 8.23.

Example 8.23 Use the Fourier series, or direct design method, to produce a low-pass digital filter with
cut-off frequency $f_c = 1$ kHz, when the sampling frequency is $f_s = 5$ kHz.

Solution We wish to make use of the non-dimensional frequency variable $\theta$ and, since $T = 1/f_s = 1/5000$, we have

$$\theta = \omega T = 2\pi f T = \frac{2\pi f}{5000}$$

The cut-off frequency is then $\theta_c = 2\pi f_c/5000 = 2\pi/5$ and the ideal frequency response
$D(e^{j\theta})$ is now defined by

$$D(e^{j\theta}) = \begin{cases} 1, & |\theta| \leq 2\pi/5 \\ 0, & |\theta| > 2\pi/5 \end{cases}$$

We now calculate the coefficients $h_d(n)$ as

$$h_d(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} D(e^{j\theta})\,e^{jn\theta}\,d\theta
= \frac{1}{2\pi}\int_{-2\pi/5}^{2\pi/5} e^{jn\theta}\,d\theta$$

$$= \frac{1}{n\pi}\sin\left(\frac{2n\pi}{5}\right), \quad \text{for } n \neq 0$$

$$= \frac{2}{5}\,\mathrm{sinc}\left(\frac{2n\pi}{5}\right) \quad \text{(also valid for } n = 0\text{)}$$

At this stage, we have to choose the length of the filter. By now, we know that a ‘long’ 
filter is likely to produce superior results in terms of frequency domain performance. 
However, experience again tells us that there will be penalties in some form or other. 
Let us choose a filter of length 9, with the coefficients selected for simplicity as sym- 
metric about n = 0. As already discussed, this choice leads to a non-causal system, but 
we deal with this problem when it arises. This scheme is equivalent to specifying the 
use of a rectangular window defined by

$$w(n) = \begin{cases} 1, & -4 \leq n \leq 4 \\ 0, & \text{otherwise} \end{cases}$$

We now calculate the coefficients $h_d(-4), h_d(-3), \ldots, h_d(0), \ldots, h_d(4)$, which are tabulated
in Figure 8.41.
Figure 8.41 Coefficients $h_d(k)$, for $k = -4, -3, \ldots, 4$.

$h_d(\pm 4)$    $h_d(\pm 3)$    $h_d(\pm 2)$    $h_d(\pm 1)$    $h_d(0)$
-0.07568      -0.06237      0.09355       0.30273       0.40000

Figure 8.42 Amplitude response of the non-causal filter of Example 8.23.
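The tabulated coefficients can be reproduced from the formula $h_d(n) = \sin(2n\pi/5)/(n\pi)$, with $h_d(0) = 2/5$. The sketch below uses only the standard library; the helper name `ideal_lowpass_coeff` is an illustrative assumption, not notation from the text.

```python
import math

def ideal_lowpass_coeff(n, theta_c):
    """h_d(n) = sin(n*theta_c)/(n*pi), with the limiting value theta_c/pi at n = 0."""
    if n == 0:
        return theta_c / math.pi
    return math.sin(n * theta_c) / (n * math.pi)

theta_c = 2 * math.pi / 5  # non-dimensional cut-off frequency of Example 8.23
h_d = {n: ideal_lowpass_coeff(n, theta_c) for n in range(-4, 5)}
for n in range(-4, 5):
    print(f"h_d({n:+d}) = {h_d[n]: .5f}")
```

Because $\sin(n\theta_c)/(n\pi)$ is even in $n$, the printed values are symmetric about $n = 0$, which is why only the entries for $k = 0, \pm 1, \ldots, \pm 4$ need tabulating.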


The transfer function of the digital filter is then $\tilde{D}(z)$, where

$$\tilde{D}(z) = \sum_{n=-4}^{4} h_d(n)z^{-n}$$

$$= -0.07568z^{4} - 0.06237z^{3} + 0.09355z^{2} + 0.30273z + 0.40000
+ 0.30273z^{-1} + 0.09355z^{-2} - 0.06237z^{-3} - 0.07568z^{-4}$$

Although this system is indeed non-causal, since its impulse response sequence contains
terms in positive powers of $z$, we can calculate the frequency response as

$$\tilde{D}(e^{j\theta}) = -0.15137\cos(4\theta) - 0.12473\cos(3\theta) + 0.18710\cos(2\theta)
+ 0.60546\cos(\theta) + 0.40000$$


Figure 8.42 illustrates the corresponding amplitude response. 
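As a rough check on the cosine form of the frequency response, the following sketch evaluates the nine-term sum directly and compares it with the quoted expression at a few frequencies. It uses the rounded coefficients from Figure 8.41, so agreement is only expected to about four decimal places; the function names are illustrative.

```python
import cmath
import math

# Coefficients h_d(0), ..., h_d(4) as tabulated in Figure 8.41; h_d(-n) = h_d(n).
h_d = [0.40000, 0.30273, 0.09355, -0.06237, -0.07568]

def response_from_sum(theta):
    """Evaluate sum_{n=-4}^{4} h_d(n) exp(-j*n*theta) term by term."""
    total = complex(h_d[0])
    for n in range(1, 5):
        total += h_d[n] * (cmath.exp(-1j * n * theta) + cmath.exp(1j * n * theta))
    return total

def response_cosine_form(theta):
    """The cosine form quoted in the text (real, because the coefficients are symmetric)."""
    return (-0.15137 * math.cos(4 * theta) - 0.12473 * math.cos(3 * theta)
            + 0.18710 * math.cos(2 * theta) + 0.60546 * math.cos(theta) + 0.40000)

theta_c = 2 * math.pi / 5
sample_thetas = [0.0, 0.5 * theta_c, theta_c, 0.6 * math.pi, 0.8 * math.pi, math.pi]
for theta in sample_thetas:
    d = response_from_sum(theta)
    print(f"theta = {theta:6.4f}   |D| = {abs(d):7.5f}   "
          f"cosine form = {response_cosine_form(theta): .5f}")
```

The values in the pass-band are close to, but not exactly, 1, and those in the stop-band are small but non-zero, which is the ripple visible in the amplitude response.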










Figure 8.42, of Example 8.23, shows us that the amplitude response of our filter is a 
reasonable approximation to the design specification. We do, however, notice that there 
are some oscillations in both pass- and stop-bands. These are due to the abrupt cut-off 
of the rectangular window function and the effect is known as Gibbs’ phenomenon. 
Thus, the window function generates additional spectral components, which are referred 
to as spectral leakage. A way of improving the performance in this respect is discussed 
in Section 8.9.2. The immediate problem is the realization of this non-causal design. To 
see how we can circumvent the difficulty, we proceed as follows. 

The transfer function we have derived is of the general form

$$\tilde{D}(z) = \sum_{k=-N}^{N} h_d(k)z^{-k}
= z^{N}\left[h_d(-N) + h_d(-N+1)z^{-1} + \ldots + h_d(0)z^{-N} + \ldots + h_d(N)z^{-2N}\right]$$

Suppose that we implement the system with transfer function

$$\hat{D}(z) = z^{-N}\tilde{D}(z)$$

which is a causal system. First we notice that, on setting $z = e^{j\omega T}$, the amplitude response
$|\hat{D}(e^{j\omega T})|$ is given by

$$|\hat{D}(e^{j\omega T})| = |e^{-j\omega NT}|\,|\tilde{D}(e^{j\omega T})| = |\tilde{D}(e^{j\omega T})|$$

that is, it is identical with that of the desired design. Furthermore,

$$\arg\{\hat{D}(e^{j\omega T})\} = \arg\{\tilde{D}(e^{j\omega T})\} - N\omega T$$

indicating a pure delay of amount $NT$ in the response of the second system. This means
that, assuming we are prepared to accept this delay, our design objective can be met by
the system with transfer function $\hat{D}(z)$ given by

$$\hat{D}(z) = -0.07568 - 0.06237z^{-1} + 0.09355z^{-2} + 0.30273z^{-3} + 0.40000z^{-4}
+ 0.30273z^{-5} + 0.09355z^{-6} - 0.06237z^{-7} - 0.07568z^{-8}$$

It is evident from Figure 8.43 that the filter designed in Example 8.23 differs from the
previous designs. The nature of this difference is the absence of feedback paths in the
block diagram realization of Figure 8.43. One effect of this is that the impulse response
sequence is finite, a fact which we already know, since the design method involved
truncating the impulse response sequence. Filters of this type are known as finite
impulse response (FIR) designs and may always be implemented using structures not
involving feedback loops. Another name used for such structures is non-recursive, but
it is not correct to assume that the only possible realization of an FIR filter is by use of
a non-recursive structure; for details see M. T. Jong, Methods of Discrete Signals and
Systems Analysis, McGraw-Hill, New York, 1982.

Figure 8.43 A realization of the final system of Example 8.23, with $a = -0.07568$, $b = -0.06237$, $c = 0.09355$, $d = 0.30273$, $f = 0.40000$.
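The effect of the delay $z^{-N}$ can be illustrated numerically. The sketch below is a plain-Python illustration under the assumption that the nine coefficients are applied as a direct, feedback-free sum; `fir_filter`, `causal_response` and `noncausal_response` are hypothetical helper names. It shows that the impulse response of the causal filter is just the finite coefficient list, and that the causal and non-causal frequency responses differ only by the phase factor $e^{-jN\theta}$.

```python
import cmath
import math

# Weights on z^0, z^-1, ..., z^-8 for the causal filter of Example 8.23.
b = [-0.07568, -0.06237, 0.09355, 0.30273, 0.40000,
     0.30273, 0.09355, -0.06237, -0.07568]
N = 4

def fir_filter(b, u):
    """y_k = sum_i b[i] * u[k - i], with u taken as zero before k = 0 (no feedback)."""
    return [sum(b[i] * u[k - i] for i in range(len(b)) if k - i >= 0)
            for k in range(len(u))]

def causal_response(theta):
    """Frequency response of the delayed (causal) filter."""
    return sum(bi * cmath.exp(-1j * i * theta) for i, bi in enumerate(b))

def noncausal_response(theta):
    """Frequency response of the original design, with indices running from -N to N."""
    return sum(bi * cmath.exp(-1j * (i - N) * theta) for i, bi in enumerate(b))

impulse = [1.0] + [0.0] * 11
y = fir_filter(b, impulse)
print("impulse response:", y)

for theta in [0.3, 1.0, 2.5]:
    dc, dn = causal_response(theta), noncausal_response(theta)
    print(f"theta = {theta}: |causal| = {abs(dc):.5f}, |non-causal| = {abs(dn):.5f}")
```

The impulse response is zero after nine samples, which is the finite impulse response property, and the two amplitude columns coincide as the argument above predicts.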


8.9.2 Windows

In this section, we consider the problem identified in Example 8.23 in connection with
the sharp cut-off of the rectangular window function.
The rectangular window sequence, illustrated in Figure 8.44, is defined by

$$w(k) = \begin{cases} 1, & |k| \leq N \\ 0, & \text{otherwise} \end{cases}$$

Figure 8.44 Rectangular window sequence.





which can be expressed in the form

$$w(k) = \zeta(k + N) - \zeta(k - (N + 1))$$

where $\zeta(k) = \{h(k)\}$, defined in Example 6.22.
Since

$$W(z) = \left(z^{N} - z^{-(N+1)}\right)\frac{z}{z - 1}
= \frac{z^{N+\frac{1}{2}} - z^{-(N+\frac{1}{2})}}{z^{\frac{1}{2}} - z^{-\frac{1}{2}}}$$

the DTFT of the sequence $\{w(k)\}$ is

$$W(e^{j\theta}) = \frac{\sin\left(\tfrac{1}{2}(2N + 1)\theta\right)}{\sin\left(\tfrac{1}{2}\theta\right)}, \quad \text{for } \theta \neq 0$$

$$= \frac{(2N + 1)\,\mathrm{sinc}\left(\tfrac{1}{2}(2N + 1)\theta\right)}{\mathrm{sinc}\left(\tfrac{1}{2}\theta\right)}$$

It is easy to see that $W(e^{j0}) = W(1) = \sum_{n=-N}^{N} w(n) = 2N + 1$ and so the above formula, using the
sinc function, is valid for all $\theta$, including $\theta = 0$. The graph of this function is illustrated
in Figure 8.45. The first positive (negative) zero in its spectrum is the positive (negative)
value of $\theta$ closest to zero such that $W(e^{j\theta}) = 0$. The main lobe of the window function
is that part of the graph of $W(e^{j\theta})$ that lies between the first positive and first negative
zero in $W(e^{j\theta})$. The main lobe width is the distance between the first positive and
negative zeros in $W(e^{j\theta})$. As the length of the window increases, the main lobe narrows
and its peak value rises and, in some sense, $W(e^{j\theta})$ approaches an impulse, which is
desirable. However, the main disadvantage is that the amplitudes of the side lobes also
increase.

Figure 8.45 DTFT of the rectangular window sequence.
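The behaviour described here can be explored with a short calculation. In the sketch below (standard library only, with illustrative function names), the closed form is compared against the direct sum, and the peak value, the first positive zero and the height of the first side lobe are printed for several window lengths. The location $\theta = 2\pi/(2N + 1)$ of the first positive zero is read off from the numerator of the closed form; the side-lobe height is found by a simple grid search, which is an assumption of this sketch rather than a method from the text.

```python
import math

def window_dtft_direct(N, theta):
    """Sum_{k=-N}^{N} exp(-j*k*theta); the sine terms cancel by symmetry."""
    return sum(math.cos(k * theta) for k in range(-N, N + 1))

def window_dtft_closed(N, theta):
    """sin((2N+1)*theta/2) / sin(theta/2), with the value 2N+1 where sin(theta/2) = 0."""
    s = math.sin(0.5 * theta)
    if abs(s) < 1e-12:
        return float(2 * N + 1)
    return math.sin(0.5 * (2 * N + 1) * theta) / s

def first_side_lobe_peak(N, samples=2000):
    """Largest |W| between the first and second positive zeros (grid search)."""
    lo = 2 * math.pi / (2 * N + 1)
    hi = 2 * lo
    return max(abs(window_dtft_closed(N, lo + (hi - lo) * i / samples))
               for i in range(samples + 1))

for N in (4, 8, 16):
    first_zero = 2 * math.pi / (2 * N + 1)
    print(f"N = {N:2d}: peak = {window_dtft_closed(N, 0.0):4.0f}, "
          f"first zero = {first_zero:.4f}, "
          f"first side lobe = {first_side_lobe_peak(N):.3f}")
```

As $N$ grows, the first zero moves towards the origin (a narrower main lobe) while both the peak and the first side lobe grow, in line with the trade-off described above.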







The use of any window leads to distortion of the spectrum of the original signal caused
by the size of the side lobes in the window spectrum and the width of the window's
main spectral lobe, producing oscillations in the filter response. The window function
can be selected so that the amplitudes of the side lobes are relatively small, with the
result that the size of the oscillations is reduced; however, in general, the main lobe
width does not decrease. Thus, in choosing a window, it is important to know the trade-off
between having a narrow main lobe and low side lobes in the window spectrum.

A considerable amount of research has been carried out, aimed at determining suitable
alternative window functions which smooth out straight truncation and thus reduce
the Gibbs’ phenomenon effects observed in the amplitude response of Figure 8.42.
To minimize the effect of spectral leakage, windows which approach zero smoothly at
either end of the sampled signal are used. We do not discuss the derivation of the various
window functions, rather we tabulate, in Figure 8.46, some of the more popular examples
in a form suitable for symmetric filters of length $2N + 1$. For a more detailed
discussion on windows and their properties, see, for example: E. C. Ifeachor and B. W.
Jervis, Digital Signal Processing: A Practical Approach, Addison-Wesley, Wokingham,
UK, 1993; A. V. Oppenheim and R. W. Schafer, Discrete-time Signal Processing, Prentice-
Hall, Englewood Cliffs, NJ, 1989; S. J. Stearns and D. R. Hush, Digital Signal Analysis,
Prentice-Hall, Englewood Cliffs, NJ, 1990.

Figure 8.46 Some popular window functions.

Window name            $w(k)$
Bartlett               $w(k) = (k + N)/N$ for $-N \leq k < 0$; $w(k) = (N - k)/N$ for $0 \leq k \leq N$
von Hann or Hanning    $w(k) = 0.5 + 0.5\cos(\pi k/(N + 1))$, $-N \leq k \leq N$
Hamming                $w(k) = 0.54 + 0.46\cos(\pi k/N)$, $-N \leq k \leq N$
Blackman               $w(k) = 0.42 + 0.5\cos(\pi k/N) + 0.08\cos(2\pi k/N)$, $-N \leq k \leq N$

In each case, $w(k) = 0$ for $k$ outside the range $[-N, N]$.

Note: Slight variations on the above definitions may be found in various texts. These
tend to involve switching between ‘division by $N$’, ‘division by $N + \tfrac{1}{2}$’ and ‘division
by $N + 1$’. For example, the von Hann or Hanning window is variously defined by
$w(k) = 0.5(1 + \cos(\pi k/N))$ or $w(k) = 0.5(1 + \cos(2\pi k/(2N + 1)))$ or $w(k) = 0.5(1 + \cos(\pi k/(N + 1)))$
for $|k| \leq N$ with $w(k) = 0$ for $|k| > N$. The Bartlett window, or one of its
variations, is sometimes referred to as a triangular window. It should also be noted
that both the Bartlett window and the Blackman window, as defined in Figure 8.46,
satisfy $w(-N) = w(N) = 0$ and hence give rise to difference equations of order $2N - 2$
rather than $2N$.

Formulations for other configurations can easily be deduced, or may be found in,
for example, L. B. Jackson, Digital Filters and Signal Processing, Kluwer Academic
Publishers, Boston, MA, 1986; R. E. Ziemer, W. H. Tranter and D. R. Fannin, Signals
and Systems, Macmillan, New York, 1983. The section closes with an example of the
application to the design of Example 8.23.

Example 8.24 Plot the amplitude response for the filter design of Example 8.23, using (a) the
Hamming window and (b) the Blackman window.

Solution (a) The transfer function coefficients are now given by $h_d(k)\,w_H(k)$, where $w_H(k)$ are
the Hamming window coefficients, calculated with $N = 4$ and $-4 \leq k \leq 4$. The
Hamming window coefficients are tabulated in Figure 8.47.

Figure 8.47 Hamming window coefficients for $-4 \leq k \leq 4$.

$k$ ($N = 4$)    $\pm 4$      $\pm 3$      $\pm 2$      $\pm 1$      $0$
$w_H(k)$       0.08000    0.21473    0.54000    0.86527    1.00000
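The Hamming values in Figure 8.47 follow directly from the window definitions in Figure 8.46. The sketch below evaluates the four tabulated windows for $N = 4$; the Bartlett window is written in the equivalent single form $(N - |k|)/N$, and all function names are illustrative. Other texts and libraries may use one of the variant definitions mentioned in the Note, so their values can differ slightly.

```python
import math

def bartlett(k, N):
    """(k + N)/N for k < 0 and (N - k)/N for k >= 0, written as one expression."""
    return (N - abs(k)) / N

def von_hann(k, N):
    return 0.5 + 0.5 * math.cos(math.pi * k / (N + 1))

def hamming(k, N):
    return 0.54 + 0.46 * math.cos(math.pi * k / N)

def blackman(k, N):
    return (0.42 + 0.5 * math.cos(math.pi * k / N)
            + 0.08 * math.cos(2 * math.pi * k / N))

N = 4
for name, w in [("Bartlett", bartlett), ("von Hann", von_hann),
                ("Hamming", hamming), ("Blackman", blackman)]:
    values = ", ".join(f"{w(k, N):.5f}" for k in range(0, N + 1))
    print(f"{name:9s} k = 0..{N}: {values}")
```

Only $k = 0, \ldots, N$ is printed because each window is even in $k$.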
The transfer function then becomes

$$D_H(z) = -0.00605 - 0.01339z^{-1} + 0.05052z^{-2} + 0.26194z^{-3} + 0.40000z^{-4}
+ 0.26194z^{-5} + 0.05052z^{-6} - 0.01339z^{-7} - 0.00605z^{-8}$$

The frequency response is then obtained by writing $z = e^{j\theta}$, as

$$D_H(e^{j\theta}) = e^{-j4\theta}\left(-0.01211\cos(4\theta) - 0.02678\cos(3\theta) + 0.10103\cos(2\theta)
+ 0.52389\cos(\theta) + 0.40000\right)$$

Figure 8.48 illustrates the magnitude of this response and the reduction of
oscillations in both the pass- and stop-band is striking. The penalty is the lack
of sharpness near the cut-off frequency, although the stop-band characteristics
close to $\theta = \pi$ are quite good.

Figure 8.48 Amplitude response of the filter of Example 8.23, with Hamming window.

(b) Proceeding as in case (a), we calculate the Blackman window coefficients as
shown in Figure 8.49.

Figure 8.49 Blackman window coefficients for $-4 \leq k \leq 4$.

$k$          $\pm 4$      $\pm 3$      $\pm 2$      $\pm 1$      $0$
$w_B(k)$     0.00000    0.06645    0.34000    0.77355    1.00000

The Blackman windowed transfer function is thus

$$D_B(z) = -0.00414 + 0.03181z^{-1} + 0.23418z^{-2} + 0.40000z^{-3} + 0.23418z^{-4}
+ 0.03181z^{-5} - 0.00414z^{-6}$$

and the frequency response is found as

$$D_B(e^{j\theta}) = e^{-j3\theta}\left(-0.00829\cos(3\theta) + 0.06361\cos(2\theta)
+ 0.46836\cos(\theta) + 0.40000\right)$$

The amplitude response is shown in Figure 8.50 and this design again suffers
from a relatively poor performance in terms of sharpness of cut-off. The ripples
observed in the pass- and stop-bands with the rectangular window have been
removed as before. However, the ‘flat’ characteristic of the Hamming design close
to $\theta = \pi$ is not evident when using the Blackman window for this particular filter.

Figure 8.50 Amplitude response of the filter of Example 8.23, with Blackman window.
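The windowed designs of Example 8.24 can be reproduced with a few lines of code. The following sketch (standard library only; the helper names are illustrative assumptions) forms the products $h_d(k)w(k)$ for the rectangular, Hamming and Blackman windows with $N = 4$ and prints the amplitude response at a handful of frequencies, so that the three designs can be compared side by side.

```python
import cmath
import math

N = 4
theta_c = 2 * math.pi / 5

def h_d(n):
    """Ideal low-pass coefficients of Example 8.23."""
    return theta_c / math.pi if n == 0 else math.sin(n * theta_c) / (n * math.pi)

def rectangular(k):
    return 1.0

def hamming(k):
    return 0.54 + 0.46 * math.cos(math.pi * k / N)

def blackman(k):
    return 0.42 + 0.5 * math.cos(math.pi * k / N) + 0.08 * math.cos(2 * math.pi * k / N)

def windowed_coefficients(window):
    """Causal coefficient list: entry i multiplies z^-i and corresponds to k = i - N."""
    return [h_d(k) * window(k) for k in range(-N, N + 1)]

def amplitude(coeffs, theta):
    """|sum_i coeffs[i] exp(-j*i*theta)|; the delay does not affect the modulus."""
    return abs(sum(c * cmath.exp(-1j * i * theta) for i, c in enumerate(coeffs)))

designs = {"rectangular": rectangular, "Hamming": hamming, "Blackman": blackman}
thetas = [0.0, 0.2 * math.pi, 0.4 * math.pi, 0.6 * math.pi, 0.8 * math.pi, math.pi]
for name, window in designs.items():
    coeffs = windowed_coefficients(window)
    row = "  ".join(f"{amplitude(coeffs, t):.4f}" for t in thetas)
    print(f"{name:12s} {row}")
```

With the Blackman window the end coefficients vanish (up to rounding), so the filter effectively has seven terms, consistent with the transfer function given in part (b). Feeding the same amplitudes into a plotting tool of choice over a fine grid of $\theta$ gives curves of the kind shown in Figures 8.42, 8.48 and 8.50.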
8.9.3 Exercises

32 Use the direct design method with a rectangular window of length 11 to produce a causal
low-pass filter with non-dimensional cut-off frequency $\theta_c = \pi/2$. Plot the frequency response.

33 Repeat Exercise 32 but use a Hamming window.


8.10 Review exercises (1-25) 


1l Calculate the Fourier sine transform of the causal 5 Establish the demodulation property, 
function f(t) defined by ОО ОЛ Оо о, 
© (0=т=1) -iF(jo)-i[F(jo-2jo)- F(jo 2j0)] 
ОЛО lese ee 


6  Usetheresult Z(H(t-- T) — H(t- T)) 22Tsinc oT 
and the symmetry property to show that 


94 5іпс г} = п[Н(0 + 1) – Н(о – 1)] 


0 (722) 


2 Show that if F{ A} = F(jæ) then Z(f(-0)) — 
E F(-jq@). Show also that Check your result by use of the inversion integral. 


S f(-t - a)) = e°F(-jo) 7 Fora wide class of frequently occurring Laplace 
transforms it is possible to deduce an inversion 
integral based on the Fourier inversion integral. 
If X(s) 2 S£(x(t)) is such a transform, we have 


where a is real and positive. 
Find 9 f(t)) when 


=w ==; t 
f(ü-1imt (-2<t<2) x(t) = Z| Rye" ds 


n2) qie 


1 
2 
where Re (s) = y, with y real, defines a line in the s 
plane to the right of all the poles of X(s). Usually 


the integral can be evaluated using the residue 


3 Use the result 


FLA(t + 37) - H(t- ir) - T sinc Ton theorem, and we then have 
and the frequency convolution result to verify that x(f) 2 X residues of X(s) e* at all 
the Fourier transform of the windowed cosine poles of X(s) 
function (a) Write down the poles for the transform 


fO = cos egt [H(t - 1T) - H(t — 1T)] 1 
(s- a)(s- b) 

where a and b are real. Calculate the residues of 
X(s) e* at these poles and invert the transform. 


X(s) = 
is 


IT [sinc } (@-— @,)T + sinc 1 (v 4- @)T] 





4 Show that (b) Calculate 
ó(t — t) *ó(t — t) — ó(t — (t * b)) (i) e 1 | G 1 | 
and hence show that (s= 2) s(st+1) 
F{cos Wt H(t)} = | nislo + a) + lo- @)] (c) Show that 
асо s 2| e nini 
2 2 2 2 
o- w (s +1) 
A linear system has impulse response $h(t)$, so that
the output corresponding to an input $u(t)$ is


VOS | A(t- T) u(t) dt 


When u(t) = cos Wot, y(t) = —SiN Wot (My = 0). 
Find the output when u(t) is given by 


(a) cosœ(t+ in) (b) sin @,t 
(c) e^ (d) e^ 


This system is known as a Hilbert transformer. 


In Section 8.5.1 we established that 


exu = 15рп(/) 


where sgn(?) is the signum function. Deduce that 


13 
Fsgn(t)} = = 


and use the symmetry result to demonstrate that 


si = jsgn(o) 
П 14 


The Hilbert transform of a signal f (f) is 
defined by 


1 m 
Рых) 2 3€ Cf0)) - 1 LO ar 
T = 
Show that the operation of taking the Hilbert 
transform is equivalent to the convolution 


-l.f0 15 
Tí 


and hence deduce that the Hilbert-transformed 
signal has an amplitude spectrum Fy;(j@) 
identical with f(t). Show also that the phase of 
the transformed signal is changed by +1 T, 
depending on the sign of c. 


Show that 


Li 
(£ +a°)(t- x) 








= = IA 5+ = ==) 
х +а “Ё +а t-x ta 


Hence show that the Hilbert transform of 





Ar pe 0 


Да 


15 


а 





2 2 
X ta 


If Fy(x) 2 2€( f(t)) is the Hilbert transform 
of f(t), establish the following properties: 
(a) Hifa + t} = Fue + a) 

(b) 2C(f(at)) — Fu(ax) (a > 0) 

(c) H#{f-at)} =—Fy(-ax) (a > 0) 


(d) lar = £ Рыб) 


(е) ЖДО} = хб) + 1 | f(t) dt 


Use Exercises (9) and (10) to deduce the inversion 
formula 


fo- -1| Ры(х) yy 
T DH EXEC 


Define the analytic signal associated with the real 
signal f(t) as 


FAD = IO) — JF) 


where F(t) is the Hilbert transform of f(t). Use 
the method of Review exercise 13 to show that 


~ (@> 0) 


FA f(t) } = F,(j@) = 
(@<0) 
Use the result #{A(t)} = 1/j@ + 16(@) and the 
symmetry property to show that 


gi E AR 
F {H(@)} = 50(t) + D 


(Hint: H(—0) «1 — H(o).) 

Hence show that if f(t) is defined by ¥{f(t)} = 
2H(o)F( jo) then f(t) = f(t) — jF y(t), the analytic 
signal associated with f(t), where F(j@) — 9 f(t)) 
and Fait) — 261 f(t)). 

If f(t) = cos Wot (Wy — 0), find Z(f(r)) and 
hence f(t). Deduce that 


H{COS Wot} = —Sin Wot 


By considering the signal g(t) = sin Mot (@) > 0), 
show that 


HC{ SiN Wot} = COS Wot 


16 


ip 


18 


A causal system has impulse response A(t), 
where /(f) 7 0 (t < 0). Define the even part 
h(t) of h(t) as 


ht) = 5 [h@ +h] 
and the odd part /,(t) as 
h(t) = 3 (h(t) – 0] 
Since h(t) = 0 (t < 0) deduce that 
AÐ = sn DAD 
and that 
Й(ї) = AÀ) + sgn (OA(A | for all t 


Verify this result for А(0) = sin t H(f). Take the 
Fourier transform of this result to establish that 


H(jo) 2 Ho) * j2t(H.Go)j 


Let A(t) — e "H(f) be such a causal impulse 
response. By taking the Fourier transform, deduce 
the Hilbert transform pair 


а 5 
at ЖКО т Оо 
a +t а +х 


Use the result 








ЖИД} = x3 C0) 1) f(t) dt 


to show that 
4 m = E 2 
a tf x +a 


The Hartley transform is defined as

$$F_H(s) = H\{f(t)\} = \int_{-\infty}^{\infty} f(t)\,\mathrm{cas}\,2\pi st\,dt$$

where $\mathrm{cas}\,t = \cos t + \sin t$. Find the Hartley
transform of the functions


(а) f()-e"H() (a 0) 


QE ЕЛ) 


(6) 0 = 
| | t s T) 


An alternative form of the Fourier transform pair is
given by

$$F(jp) = \int_{-\infty}^{\infty} f(t)\,e^{-j2\pi pt}\,dt$$


20 


21 


22 
$$g(t) = \int_{-\infty}^{\infty} G(jp)\,e^{j2\pi pt}\,dp$$

where the frequency $p$ is now measured in hertz.
Define the even part of the Hartley transform as

$$E(s) = \tfrac{1}{2}[F_H(s) + F_H(-s)]$$

and the odd part as

$$O(s) = \tfrac{1}{2}[F_H(s) - F_H(-s)]$$

Show that the Fourier transform of $f(t)$ is given by

$$F(jp) = E(p) - jO(p)$$
and confirm your result for f(t) 2 e "H(r). 
Prove the time-shift result for the Hartley 
transform in the form 
H{ At- T)} = sin(27Ts) Fas) 
t cos(2nTs) Fy(s) 


Using the alternative form of the Fourier transform
given in Review exercise 18, it can be shown that 
the Fourier transform of the Heaviside step 
function is 


ж н(ї}= —— +1б(р) 
JpT 
Show that the Hartley transform of H(f) is then 
19(5) + L 
ST 
and deduce that the Hartley transform of 
H(t - 1)is 


La(s) + ©0575 - sin ts 
2 ST 


Show that H{ô(f)} = 1 and deduce that 
H{1} = ô(s). Show also that H{ô(t — t)} = 
cas 27st, and that 


Н{саѕ 275} = Н{соѕ 275} + Н{ ѕіп 275,7) 
= (5 – 50) 

Prove the Hartley transform modulation theorem 
in the form 

H(/f(t) cos 2st} = р Fals — So) + 1 F(s + So) 
Hence show that 

Н{соѕ 275} = з [б(5 — 5) + (5 + 5)] 

Н{ѕіп 27500) = 1 [9(5 — 59) — б( + 5)] 




DS 


24 


25 


Show that 


-|®| 
FI tant} = Де. 


jo 


t 
c Consider | (1+ p 2 


Show that 
x(t) = 1 (14 cos ay) [H(t - 1T) - H(t — 1T)] 
has Fourier transform 


T[sinc o 5 sinc(Q — 0x) + 1 sinc(@+ @)] 


The discrete Hartley transform of the sequence 
{ f(r) #44 is defined by 


N-1 
H(v) = FOI cas Ga 
i) 


(= ОЗУ) 


The inverse transform is 


N-1 


Y Hv) cas (222) ОЕ EAM 


Show that in the case N — 4, 
H-Tf 
H-[H(0 H() H2) H3 
S= fü) ЛО) У)" 


їкї їл 
pelo E 
e NT a 
TTE TNNT 


Hence compute the discrete Hartley transform of 
the sequence (1, 2, 3, 4). Show that T? - 1l and 
hence that T ' — 4T, and verify that applying the 
T` operator regains the original sequence. 


Yo 
9.1 Introduction


In Chapter 5 we considered the role of ordinary differential equations in engineering. 
However, many physical processes fundamental to science and engineering are governed 
by partial differential equations, that is equations involving partial derivatives. The 
most familiar of these processes are heat conduction and wave propagation. To describe 
such phenomena, we make assumptions about gradients (for instance, the Fourier law 
that heat flow is proportional to temperature gradient) and we write down balance 
equations; partial differential equations are thus produced in a natural way. Unless the 
situation is very simple, there will be many independent variables, for example a time 
variable t and a space variable x, and the differential equations must involve partial 
derivatives. 

The application of partial differential equations is much wider than the simple 
situations already mentioned. Maxwell's equations (see Example 3.16) comprise a set 
of partial differential equations that form the basis of electromagnetic theory, and are 
fundamental to electrical engineers and physicists. The equations of fluid flow are partial 
differential equations, and are widely used in aeronautical engineering, acoustics, the 
study of groundwater flows in civil engineering, the development of most fluid handling 
devices used in mechanical engineering and in investigating flame and combustion 
processes in chemical engineering. Quantum mechanics is yet another theory governed 
by a partial differential equation, the Schrödinger equation, which forms the basis of
much of physics, chemistry and electronic engineering. Stress analysis is important in 
large areas of civil and mechanical engineering, and again requires a complicated set of 
partial differential equations. This is by no means an exhaustive list, but it does illus- 
trate the importance of partial differential equations and their solution. 

One of the major difficulties with partial differential equations is that it is extremely 
difficult to illustrate their solutions geometrically, in contrast to single-variable problems, 
where a simple curve can be used. For instance, the temperature in a room, particularly 
if it is time-varying, is not at all easy to draw or visualize, but such information is of 
crucial importance to a heating engineer. Modern graphics packages have improved the 
situation considerably in two and three dimensions and the displays can often give a 
good qualitative understanding. A second basic problem with partial differential equations 
is that it is intrinsically more difficult to solve them or even to decide whether a solution 
exists. The driving force of most physical systems that can be modelled by partial dif- 
ferential equations is determined by either what happens on the boundary of the region 
under consideration or how the system is started at zero time. Boundaries, therefore, play 
a very significant role, and we shall see that a problem can have a solution for one set 
of boundary conditions but not for another. Finding a solution to a partial differential 
equation is often quite straightforward but finding the solution that fits the boundary 
conditions is very difficult. 

The solution of partial differential equations has been greatly eased by the use of com- 
puters, which have allowed the rapid numerical solution of problems that would otherwise 
have been intractable. Such methods have generally been integrated into this chapter, 
since they are now one of the standard techniques available. However, the finite-element 
method is considered separately, since it is more complicated, and requires a lot of 
careful thought and work (the section dealing with it can be omitted on a first reading). 
The finite-element method originated in stress analysis in civil engineering work, but 
has now spread into most areas where complicated boundaries are encountered. 
There are three basic types of equation that appear in most areas of science and
engineering, and it is essential to understand their solutions before any progress can be 
made on more complicated sets of equations, nonlinear equations or equations with 
variable coefficients. 


9.2 General discussion


The three basic types of equation are referred to as the wave equation, the heat- 
conduction or diffusion equation and the Laplace equation. In this section we briefly 
discuss the formulation of these three basic forms, and then consider each in more detail 
in later sections. The various sections will concentrate on finding and understanding 
solutions of the three types of equations in simple regions. The treatment of advanced 
methods, more complicated equations and other regions will be left to more compre- 
hensive books on partial differential equations (see, for example, R. Haberman, Applied 
Partial Differential Equations, Prentice Hall, Upper Saddle River, NJ, 2003). 


9.2.1 Wave equation

$$\frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = \nabla^2 u \tag{9.1}$$

Many phenomena that involve propagation of a signal require the wave equation (9.1)
to be solved in the appropriate number of space dimensions. Perhaps the simplest, in
one space dimension, is the vibration of a taut string stretched to a uniform tension $T$
between two fixed points as illustrated in Figure 9.1(a), where $u$ is the displacement,
$x$ is measured along the equilibrium position of the string and $t$ is time. Applying
Newton's law of motion to an element $\Delta s$ of the string (Figure 9.1b), for motion in the
$u$ direction, we have

net force in $u$ direction = mass element × acceleration in $u$ direction

that is,

$$T\sin(\psi + \Delta\psi) - T\sin\psi = \rho\,\Delta s\,\frac{\partial^2 u}{\partial t^2} \tag{9.2}$$

Figure 9.1 Displacement of an element of a taut string. (Labels: displaced position of string; equilibrium position of the string.)

where $\rho$ is the mass per unit length of the string. Neglecting terms quadratic in small
quantities and using the Taylor series expansions

$$\cos\Delta\psi = 1 + O(\Delta\psi^2) \approx 1, \qquad \sin\Delta\psi = \Delta\psi + O(\Delta\psi^3) \approx \Delta\psi$$

and the expression

$$\Delta s \approx \left[1 + \left(\frac{\partial u}{\partial x}\right)^2\right]^{1/2}\Delta x \approx \Delta x$$

for the arclength, (9.2) becomes

$$T\sin\psi + T\cos\psi\,\Delta\psi - T\sin\psi = \rho\,\Delta x\,\frac{\partial^2 u}{\partial t^2}$$

or

$$T\cos\psi\,\frac{\Delta\psi}{\Delta x} = \rho\,\frac{\partial^2 u}{\partial t^2}$$

which in the limit as $\Delta x \to 0$ becomes

$$T\cos\psi\,\frac{\partial\psi}{\partial x} = \rho\,\frac{\partial^2 u}{\partial t^2} \tag{9.3}$$

Again assuming that $\psi$ itself is small for small oscillations of the string, we have
$\cos\psi \approx 1$, and the gradient of the string

$$\frac{\partial u}{\partial x} = \tan\psi \approx \psi$$

and hence from (9.3) we obtain

$$T\,\frac{\partial^2 u}{\partial x^2} = \rho\,\frac{\partial^2 u}{\partial t^2}$$

Thus the displacement of the string satisfies the one-dimensional wave equation

$$\frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} \tag{9.4}$$

and the propagation of the disturbance in the string is given by a solution of this
equation where $c^2 = T/\rho$.


By considering the theory of small displacements of a compressible fluid, sound 
waves can likewise be shown to propagate according to (9.1). The one-dimensional 
form (9.4) will model the propagation of sound in an organ pipe, while the spherically 
symmetric version of (9.1) will give a solution for waves emanating from an explosion. 
Because it is known that most wave phenomena satisfy the wave equation, it is reason- 
able, from a physical standpoint, that the propagation of electromagnetic waves will 
also satisfy (9.1). A careful analysis of Maxwell's equations in free space is required 
to show this result (see Example 3.16). We could give further examples of physical
phenomena that have (9.1) as a basic equation, but we have described enough here to
establish its importance and the need to look at methods of solution. An aspect of the
wave equation that is not often discussed is its bad behaviour. Any discontinuities in a
variable or its derivatives will, according to the wave equation, propagate with time. An
obvious physical manifestation of this is a shock wave. When an aircraft breaks the
sound barrier, a shock is produced and the sonic boom can be heard many miles away.
How the shock is produced is a complicated nonlinear effect, but once it has been
produced it propagates according to the wave equation.

Example 9.1 Show that

$$u = u_0\sin\left(\frac{\pi x}{L}\right)\cos\left(\frac{\pi ct}{L}\right)$$

satisfies the one-dimensional wave equation and the conditions
(a) a given initial displacement $u(x, 0) = u_0\sin(\pi x/L)$, and
(b) zero initial velocity, $\partial u(x, 0)/\partial t = 0$.

Solution Clearly the condition (a) is satisfied by inspection. If we now partially differentiate $u$
with respect to $t$,

$$\frac{\partial u}{\partial t} = -\frac{u_0\pi c}{L}\sin\left(\frac{\pi x}{L}\right)\sin\left(\frac{\pi ct}{L}\right)$$

so that at $t = 0$ we have $\partial u/\partial t = 0$ and (b) is satisfied.
It remains to show that (9.4) is also satisfied. Using the standard subscript notation
for partial derivatives,

$$u_{xx} = \frac{\partial^2 u}{\partial x^2} = -\frac{u_0\pi^2}{L^2}\sin\left(\frac{\pi x}{L}\right)\cos\left(\frac{\pi ct}{L}\right)$$

$$u_{tt} = \frac{\partial^2 u}{\partial t^2} = -\frac{u_0\pi^2c^2}{L^2}\sin\left(\frac{\pi x}{L}\right)\cos\left(\frac{\pi ct}{L}\right)$$

so that the equation is indeed satisfied.
This solution corresponds physically to the fundamental mode of vibration of a taut
string plucked at its centre.

Example 9.2 Verify that the function

$$u = a\exp\left[-\left(\frac{x}{h} - \frac{ct}{h}\right)^2\right]$$

satisfies the wave equation (9.4). Sketch the graphs of the solution $u$ against $x$ at $t = 0$,
$t = 2h/c$ and $t = 4h/c$.

Solution Evaluate the partial derivatives as

$$u_x = -\frac{2a(x - ct)}{h^2}\exp\left[-\left(\frac{x}{h} - \frac{ct}{h}\right)^2\right]$$

$$u_t = \frac{2ac(x - ct)}{h^2}\exp\left[-\left(\frac{x}{h} - \frac{ct}{h}\right)^2\right]$$

and

$$u_{xx} = -\frac{2a}{h^2}\exp\left[-\left(\frac{x}{h} - \frac{ct}{h}\right)^2\right] + \frac{4a(x - ct)^2}{h^4}\exp\left[-\left(\frac{x}{h} - \frac{ct}{h}\right)^2\right]$$

$$u_{tt} = -\frac{2ac^2}{h^2}\exp\left[-\left(\frac{x}{h} - \frac{ct}{h}\right)^2\right] + \frac{4a(x - ct)^2c^2}{h^4}\exp\left[-\left(\frac{x}{h} - \frac{ct}{h}\right)^2\right]$$

Clearly (9.4) is satisfied by these second derivatives.
The curves of $u$ against $x$ are plotted in Figure 9.2, and show a wave initially centred
at the origin moving with a constant speed $c$ to the right.

Figure 9.2 Propagating wave in Example 9.2.
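The verifications in Examples 9.1 and 9.2 can also be checked numerically. The sketch below is only an illustration: it estimates the second derivatives by central differences and evaluates the residual $(1/c^2)u_{tt} - u_{xx}$ at a sample point for each solution. The parameter values, the sample points and the helper names `second_difference` and `wave_residual` are assumptions made for this example, not values from the text.

```python
import math

def second_difference(f, s, step):
    """Central-difference estimate of f''(s)."""
    return (f(s + step) - 2.0 * f(s) + f(s - step)) / step ** 2

def wave_residual(u, x, t, c, step=1e-3):
    """Estimate (1/c^2) u_tt - u_xx at (x, t); near zero for a solution of (9.4)."""
    u_tt = second_difference(lambda s: u(x, s), t, step)
    u_xx = second_difference(lambda s: u(s, t), x, step)
    return u_tt / c ** 2 - u_xx

# Illustrative parameter values (assumptions, not taken from the text).
c, L, u0, a, h = 2.0, 1.5, 0.8, 1.0, 0.5

def standing_wave(x, t):
    """The solution of Example 9.1."""
    return u0 * math.sin(math.pi * x / L) * math.cos(math.pi * c * t / L)

def gaussian_pulse(x, t):
    """The solution of Example 9.2."""
    return a * math.exp(-((x - c * t) / h) ** 2)

print("Example 9.1 residual:", wave_residual(standing_wave, 0.4, 0.3, c))
print("Example 9.2 residual:", wave_residual(gaussian_pulse, 0.9, 0.2, c))

# The pulse keeps its shape and its peak sits at x = c*t.
for t in (0.0, 2 * h / c, 4 * h / c):
    print(f"t = {t:.2f}: u at x = c*t is {gaussian_pulse(c * t, t):.3f}")
```

The residuals are not exactly zero because of the finite-difference approximation, but they should be small compared with the size of the individual derivatives. The last loop shows the peak value $a$ travelling with the point $x = ct$, which is the rightward propagation at speed $c$ seen in Figure 9.2.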





Heat-conduction or diffusion equation 


1 ди 
к дї 


This equation arises most commonly when heat is transferred from a hot area to a cold 
one by conduction, when the temperature satisfies (9.5). 

In Section 3.6 a full derivation of the equation (9.5) is made. Here we shall invest- 
igate the one-dimensional version in the context of the heat flow along a thin bar. The 
bar is assumed to have a uniform cross-sectional area and an insulated outer surface 
through which no heat is lost. It is also assumed that, at any cross-section x = constant, 
the temperature T(x, t) is uniform. Consider an element of the bar from x to x + Ax, 
where x is measured along the length of the bar, as illustrated in Figure 9.3. An 
amount of heat Q(x, f) per unit time per unit area enters the left-hand face and an 


= Уи (9.5) 


Figure 9.3 Heat flow 
in an element. 
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amount Q(x + Ax, f) leaves the right-hand face of the element. The net increase per unit 
cross-sectional area in unit time is 


\[ Q(x, t) - Q(x + \Delta x, t) \]


If c is the specific heat of the bar and ρ is its density then the amount of heat in the
element is cρT Δx. The net increase in heat in the element in unit time is


\[ c\rho\,\frac{\partial T}{\partial t}\,\Delta x \]


and is equated to the net amount entering. Thus


\[ c\rho\,\frac{\partial T}{\partial t}\,\Delta x = Q(x, t) - Q(x + \Delta x, t) \]

which in the limit as Δx → 0 gives

\[ c\rho\,\frac{\partial T}{\partial t} = -\frac{\partial Q}{\partial x} \qquad (9.6) \]


The Fourier law for the conduction of heat states that the heat transferred across unit
area is proportional to the temperature gradient. Thus


\[ Q = -k\,\frac{\partial T}{\partial x} \]


where k is the thermal conductivity and the minus sign takes into account the fact
that heat flows from hot to cold. Substitution for Q in (9.6) gives the one-dimensional
heat equation


\[ \frac{1}{\kappa}\frac{\partial T}{\partial t} = \frac{\partial^2 T}{\partial x^2} \qquad (9.7) \]


where κ = k/cρ is called the thermal diffusivity.
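
The heat balance used in this derivation translates directly into a simple numerical scheme. The sketch below is an illustration only: the bar is divided into cells, the Fourier law gives the flux Q at each interior cell face, and each cell temperature is updated from the net heat entering it, as in (9.6). The material constants, grid and initial temperatures are assumed values chosen for the demonstration, and the ends of the bar are taken to be insulated.

```python
def step(T, dx, dt, k, c_rho):
    """Advance the cell temperatures by one time step using the heat balance (9.6)."""
    n = len(T)
    # Fourier law at the interior cell faces; the two end faces are insulated (Q = 0).
    Q = [0.0] * (n + 1)
    for i in range(1, n):
        Q[i] = -k * (T[i] - T[i - 1]) / dx
    # Net heat entering cell i per unit area in unit time is Q[i] - Q[i + 1].
    return [T[i] + dt * (Q[i] - Q[i + 1]) / (c_rho * dx) for i in range(n)]

k, c_rho = 1.0, 1.0                # assumed conductivity and c*rho, so kappa = 1
n, length = 50, 1.0
dx = length / n
dt = 0.4 * dx * dx * c_rho / k     # below the explicit stability limit 0.5*dx^2/kappa
T = [100.0 if i < n // 2 else 0.0 for i in range(n)]  # hot left half, cold right half
heat0 = sum(T) * dx * c_rho
for _ in range(20000):
    T = step(T, dx, dt, k, c_rho)
heat1 = sum(T) * dx * c_rho
print(heat0, heat1, min(T), max(T))
```

Because every face flux leaves one cell and enters its neighbour, the total heat in the bar is unchanged, while the temperature settles to a uniform value; this is the discrete counterpart of the insulated bar reaching a steady state.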

An entirely similar derivation for the diffusion equation can be made. The only dif- 
ference is that the Fourier law is replaced by Fick’s law that the diffusional flow of a 
material is proportional to the concentration gradient. 

The equations describing more complicated phenomena, such as the time-dependent 
electromagnetic equations or the equations of fluid mechanics, have the same basic 
structure as (9.5), but with additional terms or with coupling to other equations of the 
same type. We certainly need to know how to solve (9.5) before even contemplating 
solving these more complex versions. 

An essential feature of the heat-conduction equation is that, given a long enough
time and assuming that there are no time-varying inputs, the temperature will eventually
settle down to a steady state. Thus the final solution is independent of time, and hence
will satisfy ∂u/∂t = 0 or ∇²u = 0. The transient behaviour tells how this solution is
approached from its given starting value. Physically it is reasonable that any initial
temperature, however complicated, will move to a smooth final solution, and we should
not expect the severe difficulties with discontinuities that occur with the wave equation.
Exactly how initial discontinuities are treated in a numerical solution, however, can
affect the accuracy in the early development of the solution.


Example 9.3 Show that

\[ T = T_\infty + (T_m - T_\infty)\,e^{-U(x-Ut)/\kappa} \qquad (x \ge Ut) \]

satisfies the one-dimensional heat-conduction equation (9.7), together with the boundary
conditions T → T_∞ as x → ∞ and T = T_m at x = Ut.


Solution The second term vanishes as x → ∞, for any fixed t, and hence T → T_∞. When x = Ut,
the exponential term is unity, so the T_∞ terms cancel and T = T_m. Hence the two boundary
conditions are satisfied. Checking both sides of the heat-conduction equation (9.7),

\[ \frac{1}{\kappa}\frac{\partial T}{\partial t} = \frac{U^2}{\kappa^2}(T_m - T_\infty)\,e^{-U(x-Ut)/\kappa} \]

\[ \frac{\partial^2 T}{\partial x^2} = \frac{U^2}{\kappa^2}(T_m - T_\infty)\,e^{-U(x-Ut)/\kappa} \]

which are obviously equal, so that the equation is satisfied.

The example models a block of material being melted at a temperature T_m, with the
melting boundary having constant speed U, and with a steady temperature T_∞ at great
distances. An application of this model would be a heat shield on a re-entry capsule
ablated by frictional heating.


Example 9.4 Show that the function

\[ T = \frac{1}{\sqrt{t}}\exp\left(-\frac{x^2}{4\kappa t}\right) \]

satisfies the one-dimensional heat-conduction equation (9.7). Plot T against x for various
times t, and comment.


Solution We first calculate the partial derivatives

\[ \frac{\partial T}{\partial t} = -\frac{1}{2t^{3/2}}\exp\left(-\frac{x^2}{4\kappa t}\right) + \frac{x^2}{4\kappa t^{5/2}}\exp\left(-\frac{x^2}{4\kappa t}\right) \]

\[ \frac{\partial T}{\partial x} = -\frac{1}{\sqrt{t}}\,\frac{2x}{4\kappa t}\exp\left(-\frac{x^2}{4\kappa t}\right) \]

and

\[ \frac{\partial^2 T}{\partial x^2} = -\frac{1}{2\kappa t^{3/2}}\exp\left(-\frac{x^2}{4\kappa t}\right) + \frac{x^2}{4\kappa^2 t^{5/2}}\exp\left(-\frac{x^2}{4\kappa t}\right) \]


Figure 9.4 Solution of the heat-conduction equation starting from an initial spike in
Example 9.4. [Plot of temperature against distance for L = 0.1, 0.3, 0.5 and 1.]


9.2.3


It is easily checked that (9.7) is satisfied except at the time t = 0, where T is not properly
defined. The graph of reduced temperature T/√(4κ) against distance x at various times
t = L²/4κ can be seen in Figure 9.4. Physically, the problem corresponds to a very hot
weld being applied instantaneously to the bar. The initial temperature ‘spike’ at x = 0 is
seen to spread out as time progresses, and, as expected from the physical interpretation,
T tends to zero for all x as the time becomes large. Alternatively the problem describes
the diffusion of a large pulse of contaminant into a thin tube of fluid.


Laplace equation


\[ \nabla^2 u = 0 \qquad (9.8) \]


The simplest physical interpretation of this equation has already been mentioned,
namely as the steady-state heat equation. So, for example, the two-dimensional
Laplace equation

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \]

could represent the steady-state distribution of temperature over a thin rectangular
plate in the (x, y) plane.


Heat transfer is well understood intuitively, and good guesses at steady-state solutions 
can usually be made. Perhaps less commonly understood is the case of the electrostatic 
potential in a uniform dielectric, which also satisfies the Laplace equation. Working out 
the electrical behaviour of a capacitor that is charged in a certain manner simply implies 
solving (9.8) subject to appropriate boundary conditions. Possibly the least obvious, 
but extremely important, application of the Laplace equation is in inviscid, irrotational 
fluid mechanics. To a large extent, subsonic aerodynamics is based on (9.8) as an 
approximate model. The lift and drag on an aerofoil in a fluid stream can be evaluated 
accurately from suitable solutions of this equation. It is only close to the aerofoil that 
viscous and rotational effects become important. 

The Laplace equation is a ‘smoother’ in the sense that it irons out peaks and troughs. 
Physically, the steady-state heat-conduction context tells us that if a particular point has
a higher temperature than neighbouring points then heat will flow from hot to cold until
the ‘hot spot’ is eliminated. Thus there are no interior points at which the solution u 
of (9.8) is smaller or larger than all of its neighbours. This result can be confirmed 
mathematically, and establishes that smooth solutions are obtained, see Section 9.7.1. 
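
The smoothing property can be seen in a small numerical experiment. The sketch below is an illustration under assumed conditions (a square plate with one edge held at 100 and the other three at 0, on a 21 by 21 grid): each interior value is repeatedly replaced by the mean of its four neighbours, which is the standard five-point discretization of (9.8). The starting guess deliberately contains an interior 'hot spot'.

```python
def relax(u, sweeps):
    """Jacobi sweeps: replace each interior value by the mean of its four neighbours."""
    n = len(u)
    for _ in range(sweeps):
        new = [row[:] for row in u]
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
        u = new
    return u

n = 21
u = [[0.0] * n for _ in range(n)]
for j in range(n):
    u[0][j] = 100.0            # one hot edge; the other three edges are held at 0
u[n // 2][n // 2] = 500.0      # an interior 'hot spot' in the starting guess
u = relax(u, 2000)
interior = [u[i][j] for i in range(1, n - 1) for j in range(1, n - 1)]
print(min(interior), max(interior), u[n // 2][n // 2])
```

After relaxation the hot spot has disappeared and every interior value lies between the smallest and largest boundary values, so the extreme values occur on the boundary.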


Example 9.5 Show that

\[ u = x^4 - 2x^3y - 6x^2y^2 + 2xy^3 + y^4 \]

satisfies the Laplace equation.


Solution Differentiating

\[ u_x = 4x^3 - 6x^2y - 12xy^2 + 2y^3, \qquad u_y = -2x^3 - 12x^2y + 6xy^2 + 4y^3 \]

\[ u_{xx} = 12x^2 - 12xy - 12y^2, \qquad u_{yy} = -12x^2 + 12xy + 12y^2 \]

so clearly

\[ u_{xx} + u_{yy} = 0 \]

and the two-dimensional Laplace equation is satisfied.


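Example 9.6 Show that the function

\[ \psi = Uy\left(1 - \frac{a^2}{x^2 + y^2}\right) \]

satisfies the Laplace equation, and sketch the curves ψ = constant.


Solution First calculate the partial derivatives:

\[ \psi_x = \frac{2xyUa^2}{(x^2+y^2)^2} \]

\[ \psi_y = U - \frac{Ua^2}{x^2+y^2} + \frac{2y^2Ua^2}{(x^2+y^2)^2} \]

\[ \psi_{xx} = \frac{2yUa^2}{(x^2+y^2)^2} - \frac{8x^2yUa^2}{(x^2+y^2)^3} \]

\[ \psi_{yy} = \frac{2yUa^2}{(x^2+y^2)^2} + \frac{4yUa^2}{(x^2+y^2)^2} - \frac{8y^3Ua^2}{(x^2+y^2)^3} \]

Substituting into (9.8) gives

\[ \psi_{xx} + \psi_{yy} = \frac{8yUa^2}{(x^2+y^2)^2} - \frac{8y(x^2+y^2)Ua^2}{(x^2+y^2)^3} = 0 \]

and hence the Laplace equation is satisfied.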


Figure 9.5 
Streamlines for flow 
past a cylinder of 
radius 1, from the 
Laplace equation in 
Example 9.6. 


9.2.4


Secondly, to sketch the contours, we note that ψ = 0 on y = 0 and on the circle
x² + y² = a². On keeping y = y₀ and letting x → ±∞, the second term vanishes, so the
curves tend to ψ = Uy₀. Figure 9.5 shows the solution, which corresponds physically to
the flow of an inviscid, irrotational fluid past a cylinder placed in a uniform stream.


Computer packages can verify the differentiations and the plotting in any of the
examples in this section. For instance, the MAPLE instructions

psi:=U*y*(1-a^2/(x^2+y^2));
simplify(diff(psi,x,x)+diff(psi,y,y));

verify the Laplace equation in Example 9.6. The plotting in Figure 9.5 can be
achieved from the instructions

with(plots):
U:=1: a:=1:
contourplot(psi,x=-3..3,y=0..2,contours=15,grid=[50,50],
scaling=constrained);


Other and related equations 


We discussed in Section 9.1 applications in science and engineering. Many such applica- 
tions are governed by equations that are closely related to the three basic equations 
discussed above. For example, consider the equations of slow, steady, viscous flow in 
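two dimensions, which take the form

\[ \frac{\partial p}{\partial x} = \frac{1}{R}\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right), \qquad
\frac{\partial p}{\partial y} = \frac{1}{R}\left(\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2}\right), \qquad
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0 \qquad (9.9) \]

where u, v and p are the non-dimensional velocities and pressure, and R is the Reynolds
number. The system has a familiar look about it, and indeed a little simple manipulation
gives ∇²p = 0, so that the pressure satisfies the Laplace equation. If p can be calculated
then ∂p/∂x and ∂p/∂y are known, so we have equations of the form

\[ \nabla^2 u = f(x, y) \qquad (9.10) \]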


This equation is called the Poisson equation, and is clearly closely related to the Laplace 
equation. It can be interpreted physically as steady heat conduction with heat sources 
in the region. A careful study of the solution of the Laplace equation is required before 
either (9.10) or (9.9) can be attacked. In Sections 9.5 and 9.7 some discussions of the 
Poisson equation take place. 

If there is good knowledge about the time behaviour of the wave or diffusion equation
then we can often obtain important information from them without solving the full
equations. For instance, if we put a periodic time dependence u = e^{jωt} v(x, y, z) into (9.1),
or if we put an exponentially decaying solution u = e^{−αt} v(x, y, z) into (9.5), then the
variable v, in both cases, satisfies an equation of the form

\[ \nabla^2 v + \lambda v = 0 \qquad (9.11) \]

Equation (9.11) is called the Helmholtz equation, and plays an important role in the
solution of eigenvalue problems. It is perhaps of relevance that the best studied eigenvalue
equation, the Schrödinger equation, is almost the same, namely

\[ \frac{h^2}{8\pi^2 m}\nabla^2 u - V(x, y, z)\,u + Eu = 0 \]

It is a bit more complicated than (9.11), but it forms the basis of quantum mechanics,
on which whole industries are built.

So far, all of the equations that we have considered are linear, since they have not 
included any quadratic (or higher) terms in u or its derivatives. As soon as we move from 
linear to nonlinear problems, a whole new crop of theoretical and computational difficulties 
arises. Very few such equations can be solved analytically, and devising computational 
schemes is not easy. Even worse, mathematicians cannot always tell whether or not a solu- 
tion even exists. An act of faith is usually made by scientists and engineers that their problem 
is modelled correctly and therefore there must be a mathematical solution reflecting the 
physics. Often the faith is well founded, but modelling is an imperfect art and there are many 
things that can go wrong. It may be thought that nonlinear problems do not occur in practice, 
but this is certainly not the case. For some phenomena, like the behaviour of thermionic 
valves or avalanche semiconductors or pulsed lasers, it is the nonlinearity that produces 
the desired effects. Other situations arise where the nonlinearity of the system may or 
may not be important. For instance, the full steady two-dimensional fluid equations are

\[ u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} = -\frac{\partial p}{\partial x} + \frac{1}{R}\nabla^2 u \]

\[ u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} = -\frac{\partial p}{\partial y} + \frac{1}{R}\nabla^2 v \qquad (9.12) \]

\[ \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0 \]

where u, v, p and R are defined as for (9.9). These equations are nonlinear because of
the presence of quadratic terms such as u ∂u/∂x. It can be seen that (9.12) reduce to (9.9)
for slow flow when quadratic terms are neglected. While (9.9) would be applicable to
the flow of molten glass, we would need the full equations (9.12) to look at flow close
to an aerofoil. Indeed, as R becomes large, the flow becomes turbulent, that is unstable,
and the applicability of these equations comes into question.


9.2.5 Arbitrary functions and first-order equations


In each of the examples in this section, a solution has been given; it has been checked
that the solution satisfies the appropriate partial differential equation. In no case has the
boundary condition been part of the specification of the problem, although in several
cases boundary conditions were checked. In the next sections the boundary conditions
are given as part of the set-up of the example. This is the natural way that a physical
problem is specified and it proves to be a much tougher proposition.

The most significant difference between ordinary and partial differential equations
is the treatment of the ‘arbitrary constants’. Consider the examples:

ODE

Solve the ordinary differential equation

\[ \frac{dy(t)}{dt} = 3t^2 \]

Integrating gives

\[ y(t) = t^3 + K \]

where K is an arbitrary constant, since differentiating y(t) with respect to t
produces 3t² whatever the value of the constant K.

PDE

Solve the partial differential equation

\[ \frac{\partial z(x, t)}{\partial t} = 3t^2 \]

Integrating gives

\[ z(x, t) = t^3 + f(x) \]

where f(x) is an arbitrary function. Differentiating with respect to t produces
3t² for any function f(x) because x is kept constant in the partial differentiation.


Extending this idea it can be seen that each partial integration introduces an arbitrary
function into the solution. Sufficient conditions must be given to determine these arbitrary
functions. It is not always easy to decide exactly what conditions are required, but
in subsequent sections an idea will be given for the three classic equations, the wave
equation, the heat-conduction equation and the Laplace equation. An extended discussion
can be found in Section 9.8.

Consider for the moment a first-order equation. Such equations are of less interest in
applications to engineering and science, but there is a comprehensive theory for their
solution which will illustrate the use of arbitrary functions.


Example 9.7 Find the general solution, u(x, t), of the partial differential equation

\[ \frac{\partial u}{\partial t} + \frac{\partial u}{\partial x} = 0 \]

and find the particular solution when u(x, 0) = x².


Solution Change the variables to z = x − t and T = t and use the chain rule to evaluate the terms in
the equation

\[ \frac{\partial u}{\partial t} = \frac{\partial u}{\partial z}\frac{\partial z}{\partial t} + \frac{\partial u}{\partial T}\frac{\partial T}{\partial t} = -\frac{\partial u}{\partial z} + \frac{\partial u}{\partial T} \]

\[ \frac{\partial u}{\partial x} = \frac{\partial u}{\partial z}\frac{\partial z}{\partial x} + \frac{\partial u}{\partial T}\frac{\partial T}{\partial x} = \frac{\partial u}{\partial z} \]


Figure 9.6 Surface z = f(x, y) showing the tangent, normal and characteristic curve C.


Putting these derivatives into the equation

\[ 0 = \frac{\partial u}{\partial t} + \frac{\partial u}{\partial x} = \frac{\partial u}{\partial T} \]

Thus u(z, T) can be deduced as

\[ u(z, T) = f(z), \quad \text{where } f \text{ is an arbitrary function} \]

Reverting to the original variables

\[ u(x, t) = f(x - t) \]

and a general solution of the partial differential equation has been obtained.
For the particular solution with initial conditions written in parametric form, x = s,
t = 0, u = s², it is easily deduced that s² = f(s) and hence

\[ u(x, t) = (x - t)^2 \]
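
The role of the arbitrary function can be illustrated numerically. In the sketch below three quite different profiles f are tried (the particular profiles and the sample point are arbitrary choices made for illustration), and in each case u = f(x − t) is found to satisfy the equation of Example 9.7 to within the accuracy of the finite-difference estimates.

```python
import math

def residual(f, x, t, d=1e-5):
    """Central-difference estimate of u_t + u_x for u(x, t) = f(x - t)."""
    u = lambda x_, t_: f(x_ - t_)
    u_t = (u(x, t + d) - u(x, t - d)) / (2 * d)
    u_x = (u(x + d, t) - u(x - d, t)) / (2 * d)
    return u_t + u_x

# Three illustrative choices of the arbitrary function f.
profiles = {
    "square": lambda z: z * z,          # gives the particular solution (x - t)^2
    "sine": math.sin,
    "bump": lambda z: math.exp(-z * z),
}
for name, f in profiles.items():
    print(name, residual(f, 0.8, 0.3))
```

Only the initial condition u(x, 0) = x² singles out the first of these profiles.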


The solution of quasi-linear first-order equations with two variables, x and y,

\[ P(x, y, z)\frac{\partial z}{\partial x} + Q(x, y, z)\frac{\partial z}{\partial y} = R(x, y, z) \qquad (9.13) \]

is comparatively straightforward. Provided P, Q and R are ‘well behaved’ a method of solution can be deduced, although
the resulting integrals cannot always be obtained explicitly. Extension to many variable
problems is similar, but the geometrical interpretation is more difficult.

In Section 3.2.1 it was seen that the function z = f(x, y), illustrated in Figure 9.6, has a
normal (∂z/∂x, ∂z/∂y, −1) at a typical point M, having coordinates (x, y, z). Equation (9.13)
says that the normal to the surface is perpendicular to the vector (P, Q, R) at the point
M and thus (P, Q, R) must lie in the tangent plane. Now examine the curve C in the surface
starting at the point A and moving along C in a direction that is always parallel to (P,
Q, R) at the current point. The direction therefore remains perpendicular to the normal
(∂z/∂x, ∂z/∂y, −1) at all points and must move in a tangential direction to the surface;
such curves are called characteristic curves. The point must remain in the surface and
this tangential direction (dx, dy, dz) must therefore be parallel to (P, Q, R) so that

\[ \frac{dx}{P} = \frac{dy}{Q} = \frac{dz}{R} \qquad (9.14) \]


Starting from (9.13) we have shown that z(x, y) can be obtained from the two ordinary
differential equations (9.14). If we start from (9.14) we know that the normal direction
(∂z/∂x, ∂z/∂y, −1) is perpendicular to the tangent vector (dx, dy, dz) and hence perpendicular
to (P, Q, R) so

\[ P(x, y, z)\frac{\partial z}{\partial x} + Q(x, y, z)\frac{\partial z}{\partial y} - R(x, y, z) = 0 \]

and (9.13) is satisfied.

From a particular starting point, x = a, y = b, z = c, the solution of (9.13) is obtained as
the characteristic curve obtained from the solution of the ordinary differential equations
(9.14). Usually there is a starting curve; then essentially the process calculates the characteristic
curve from each point of the starting curve and the solution surface is generated.

To illustrate the method return to Example 9.7 when the equations (9.14) become

\[ \frac{dt}{1} = \frac{dx}{1} = \frac{du}{0} \]

using the variables given. The two ordinary differential equations are

\[ \frac{dx}{dt} = 1 \quad \text{with solution} \quad x - t = A \]

\[ \frac{du}{dt} = 0 \quad \text{with solution} \quad u = B \]

The constants A and B are arbitrary and are determined from a given initial data point.
Usually the initial data is given on a curve t = 0, x = f(s), y = g(s), so for each s there are
arbitrary constants A and B; in this case, the constants depend on s, that is A(s) and B(s).
In the current example (Example 9.7) the initial data is t = 0, x = s, u = s² giving

s = A so x − t = s
s² = B so u = s²

Eliminating s gives u = (x − t)² as deduced earlier. A further example shows how the
method is applied.


Example 9.8 Solve the equation

\[ x\frac{\partial z}{\partial x} + y\frac{\partial z}{\partial y} = xy \]

for z(x, y) given that z = f(s) when x = s and y = 1 − s.


Solution The two ordinary equations obtained from (9.14) are

\[ \frac{dx}{x} = \frac{dy}{y}, \qquad \frac{dx}{x} = \frac{dz}{xy} \qquad (9.15a,b) \]
x y х ху 


Solving (9.15a) gives ln x = ln y + C which reduces to x = Ay. Putting this result into
(9.15b) gives

\[ x\,dx = A\,dz \]

which on solving gives

\[ \tfrac{1}{2}x^2 = Az + B \]

To obtain the arbitrary constants A and B we insert the initial conditions

\[ A = \frac{x}{y} = \frac{s}{1-s} \qquad (9.16a) \]

\[ B = \tfrac{1}{2}x^2 - \frac{x}{y}z = \tfrac{1}{2}s^2 - \frac{s}{1-s}f(s) \qquad (9.16b) \]

Thus A and B have been obtained in terms of s. From (9.16a) we get

\[ \frac{x}{y} = \frac{s}{1-s} \quad \text{and hence} \quad s = \frac{x}{x+y} \]

So that (9.16b) becomes

\[ \tfrac{1}{2}x^2 - \frac{x}{y}z = \tfrac{1}{2}\left(\frac{x}{x+y}\right)^2 - \frac{x}{y}f\left(\frac{x}{x+y}\right) \]

Rearranging, z is then calculated as

\[ z = \tfrac{1}{2}xy - \frac{xy}{2(x+y)^2} + f\left(\frac{x}{x+y}\right) \]

The solution can be checked by substitution into the original differential equation.
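
A quick numerical version of this check is sketched below. It is an illustration only: the function f is arbitrary in the example, so sin is used here purely as a sample choice, and the partial derivatives are estimated by central differences.

```python
import math

def z(x, y, f):
    """Solution of Example 9.8 for a chosen initial profile f."""
    return 0.5 * x * y - 0.5 * x * y / (x + y) ** 2 + f(x / (x + y))

def pde_residual(x, y, f, d=1e-5):
    """x z_x + y z_y - x y, estimated with central differences."""
    z_x = (z(x + d, y, f) - z(x - d, y, f)) / (2 * d)
    z_y = (z(x, y + d, f) - z(x, y - d, f)) / (2 * d)
    return x * z_x + y * z_y - x * y

f = math.sin          # arbitrary illustrative choice of f
print(pde_residual(1.3, 0.7, f))       # close to zero
s = 0.4
print(z(s, 1 - s, f), f(s))            # the two values agree on the initial curve
```

The first printed value confirms that the equation is satisfied at the sample point, and the second line confirms that z reduces to f(s) on the line x = s, y = 1 − s.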





Because there is a comprehensive theory of the solution of some classes of first-order
partial differential equations, computer packages can be used to solve these
equations with comparative ease. The MAPLE instructions

with(PDEtools):
PDE:=x*diff(z(x,y),x)+y*diff(z(x,y),y)=x*y;
pdsolve(PDE);

produce the general solution in Example 9.8.

Figure 9.7 Draining of liquid down the side of a vessel.

A practical example of first-order equations involves the draining of liquid from a vessel,
a procedure common to many industrial processes. The thickness of the liquid layer is
required as time progresses.


Example 9.9 A thin layer of liquid drains down the side of a vessel, as illustrated in Figure 9.7. From
the theory of thin layers, the equation for the fluid motion is given by

\[ \frac{\partial h}{\partial t} + a h^2 \frac{\partial h}{\partial x} = 0 \]

where h(x, t) is the thickness of the layer and a is a constant that depends on the viscosity,
density and the gravity constant. Find the solution for h(x, t) given the initial condition
h(x, 0) = αx where α is a constant.


Solution The ordinary differential equations (9.14) are

\[ \frac{dt}{1} = \frac{dx}{a h^2} = \frac{dh}{0} \]

Clearly h = C, a constant, solves one of the equations and the other is

\[ \frac{dx}{dt} = a h^2 = aC^2 \quad \text{with solution} \quad x = atC^2 + K \]

where C and K are arbitrary constants. Thus the solution of the equation is

\[ h = C, \qquad K = x - a t h^2 \]

Using MAPLE to try for a solution, the instructions

with(PDEtools):
drain:=diff(h(x,t),t)+a*h(x,t)^2*diff(h(x,t),x)=0;
pdsolve(drain);

give h as a solution of the equation

\[ f(h) = x - a t h^2 \]

Note that the package has combined the arbitrary constants C and K into one arbitrary
function f, determined by the initial data.

Clearly the package can go no further without the specification of the initial condition.
Putting in the conditions x = s, t = 0, h = αs gives

\[ f(\alpha s) = s \quad \text{or} \quad f(h) = \frac{h}{\alpha} \]

The function h can now be calculated as

\[ h = \frac{-1 + \sqrt{1 + 4a\alpha^2 x t}}{2a\alpha t} \]

and shows that, for large t, the layer thins at a rate proportional to t^{−1/2}. The solution can
be checked by direct substitution into the original equations. A plot of h against x at
successive times or a three-dimensional plot of h(x, t) using PDEplot in MAPLE can
be used to illustrate the solution.
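
The characteristic construction can also be followed directly. The sketch below is an illustration with assumed values of a and α: along each characteristic h keeps its initial value αs while x moves at speed ah², and the closed-form expression for h, evaluated at the point reached, should return the same thickness.

```python
import math

a, alpha = 2.0, 0.5   # illustrative constants (assumed, not from the text)

def h_closed(x, t):
    """Closed-form thickness for h(x, 0) = alpha*x (positive root of a*t*h^2 + h/alpha - x = 0)."""
    if t == 0.0:
        return alpha * x
    return (-1.0 + math.sqrt(1.0 + 4.0 * a * alpha**2 * x * t)) / (2.0 * a * alpha * t)

def along_characteristic(s, t):
    """Start at x = s with h = alpha*s; h stays constant and dx/dt = a*h^2."""
    h = alpha * s
    return s + a * h * h * t, h

t = 3.0
for s in [0.5, 1.0, 2.0]:
    x, h = along_characteristic(s, t)
    print(f"x={x:.3f}  h(char)={h:.6f}  h(closed)={h_closed(x, t):.6f}")

# For large t the thickness behaves like sqrt(x/(a*t)), that is like t**(-1/2).
big_t = 1.0e6
print(h_closed(1.0, big_t) * math.sqrt(a * big_t))
```

The last printed value is close to 1, consistent with the t^{−1/2} thinning noted in the example.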





The MATLAB instructions that produce the surface shape, h/α, at successive times
aα²t = 0, 2, 4, 8 are

y=0:0.1:4;
X=[y;y+2*y.^2;y+4*y.^2;y+8*y.^2];
plot(X,y)


Further applications of first-order equations occur in the study of the time evolution of 
the probability distribution of the position of a particle, for instance in Brownian motion. 
The equation is 


ӘЙ) ‚д 2. 
- Í +ZID: Df 1 = 3106 tfe, 0] 


where D₁ and D₂ are drift and diffusion coefficients. If diffusion can be neglected then
the equation is just a first-order partial differential equation.

In gas dynamics and also in traffic flow problems a similar equation, the Burgers’
equation, can be shown to apply

\[ \frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} = \nu\frac{\partial^2 u}{\partial x^2} \]

In the two situations u is the gas velocity or the traffic density and ν is a viscosity
coefficient. For the inviscid case, again the equation reduces to a first-order partial differential
equation. The derivation of these equations is lengthy and beyond the scope of
this text but can be found in specialist books.

The solution of the inviscid Burgers’ equation is obtained from

\[ \frac{dt}{1} = \frac{dx}{u} = \frac{du}{0} \]

The two equations give one obvious solution u = A and the second is

\[ \frac{dx}{dt} = u = A \quad \text{and hence} \quad x = At + B \]

Taking initial conditions t = 0, u = V(s) and x = s we obtain

\[ A = u = V(s) \]

\[ B = x - ut = s \]

Eliminating s gives the solution for u(x, t) in implicit form

\[ V(x - ut) = u \]
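
Because the solution is only implicit, u has to be found numerically at each point. The sketch below is one possible approach rather than a prescribed method: it assumes a smooth initial profile V(s) = exp(−s²), chosen purely for illustration, and solves u = V(x − ut) by bisection at a time before the characteristics first cross, so that the root is unique. The result is compared with the value carried along the characteristic through (s, 0).

```python
import math

def V(s):
    """Illustrative initial profile u(x, 0) = V(x) (an assumed choice)."""
    return math.exp(-s * s)

def burgers_u(x, t, lo=0.0, hi=1.0, iters=60):
    """Solve u = V(x - u t) for u by bisection on g(u) = u - V(x - u t)."""
    g = lambda u: u - V(x - u * t)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

t = 0.5   # earlier than the first crossing of characteristics for this V
for s in [-1.0, 0.0, 0.7]:
    x = s + V(s) * t   # characteristic through (s, 0): x = A t + B with A = V(s), B = s
    print(f"x={x:.4f}  u={burgers_u(x, t):.6f}  V(s)={V(s):.6f}")
```

The bracket [0, 1] is valid here because this V takes values between 0 and 1; a different profile would need a different bracket.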


9.2.6 Exercises 


1 Find the possible values of a and b in the expression

\[ u = \cos at \, \sin bx \]

such that it satisfies the wave equation

\[ \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} \]

2 Taking

\[ u = f(x + \alpha t) \]

where f is any function, find the values of α that will ensure that u satisfies the wave equation

\[ \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} \]

3 Verify that the function

\[ u(x, y) = x^4 - 6x^2y^2 + y^4 \]

satisfies the Laplace equation.

4 The function z(r, t) depends only on the radial distance in spherical polar coordinates and on
the time. The wave equation in this coordinate system is

\[ \frac{\partial^2 z}{\partial r^2} + \frac{2}{r}\frac{\partial z}{\partial r} = \frac{1}{c^2}\frac{\partial^2 z}{\partial t^2} \]

Show that z(r, t) = r^{−1} cos (r − ct) satisfies the equation (r ≠ 0).

5 Find all the possible solutions of the heat-conduction equation

\[ \frac{1}{\kappa}\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} \]

of the form

\[ u(x, t) = e^{-\alpha t}V(x) \]

6 Find the values of the constant n for which

\[ V = r^n(3\cos^2\theta - 1) \]

satisfies the Laplace equation (in spherical polar coordinates and independent of φ)

\[ \frac{\partial}{\partial r}\left(r^2\frac{\partial V}{\partial r}\right) + \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial V}{\partial\theta}\right) = 0 \]

for all values of the variables r and θ.

7 Show that u = e^{−kt} cos mx cos nt is a solution of the equation

\[ \frac{\partial^2 u}{\partial x^2} = \frac{1}{c^2}\left(\frac{\partial^2 u}{\partial t^2} + 2k\frac{\partial u}{\partial t}\right) \]

provided that the constants k, m, n and c are related by the equation n² + k² = c²m².

8 If V = x³ + axy², where a is a constant, show that

\[ x\frac{\partial V}{\partial x} + y\frac{\partial V}{\partial y} = 3V \]

Find the value of a if V is to satisfy the equation

\[ \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} = 0 \]

Taking this value of a, show that if u = r³V, where r² = x² + y², then

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 27rV \]

9 The telegraph equation has the form

\[ \frac{\partial^2 \phi}{\partial x^2} = \frac{1}{c^2}\left(\frac{\partial^2 \phi}{\partial t^2} + k\frac{\partial \phi}{\partial t}\right) \]

where c is the speed of light and k is usually small. Given that Φ(x, t) is a solution of the wave
equation

\[ \frac{\partial^2 \Phi}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 \Phi}{\partial t^2} \]

show that φ(x, t) = Φ(x, t) e^{−kt/2} is a solution of the telegraph equation, if terms of order k² can be
neglected.

10 The transmission-line equations represent the flow of current along a long, leaky wire such as a
transatlantic cable. The equations take the form

\[ -\frac{\partial I}{\partial x} = gv + c\frac{\partial v}{\partial t} \]

\[ -\frac{\partial v}{\partial x} = rI + L\frac{\partial I}{\partial t} \]

where g, c, r and L are constants and I and v are the current and voltage respectively.

(a) Show that when r = g = 0, the equations reduce to the wave equation.

(b) Show that when L = 0, the equations reduce to a heat-conduction equation with a
forcing term. Write W = v e^{gt/c} to reduce to the normal form of the equation.

(c) Put a = ½(r/L + g/c) and then w = v e^{at}. Show that when rc = gL, w satisfies the wave
equation.

11 Show that if f is a function of x only then

\[ u = f(x)\sin(ay + b) \]

where a and b are constants, is a solution of the partial differential equation

\[ \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial x^2} = 2a\frac{\partial u}{\partial x} \]

provided that f(x) satisfies the ordinary differential equation

\[ \frac{d^2 f}{dx^2} - 2a\frac{df}{dx} + a^2 f = 0 \]

Hence show that

\[ u = (A + Bx)\,e^{ax}\sin(ay + b) \]

where A and B are arbitrary constants, is a solution of the partial differential equation.


Show that f(x, y) 2 x^»? 
differential equation 


ey = 4х?у 


+ g(x/y) satisfies the partial 


for any arbitrary function g. It is given that f= £ оп 
the line with parametric equation x = 1 — t, y = t; 
find the function g. 

Show that the partial differential equation 


Ou „9и 
eu Ox 


has the general solution 


ulx, y) = e?LfG) * 02] 


where fand g are arbitrary functions. 


=0 


Find the general solution for u(x, y) in the equation
(check using MAPLE)

\[ x^2\frac{\partial u}{\partial x} + y^2\frac{\partial u}{\partial y} = (x + y)u \]

Show that the solution that satisfies the conditions
u = s², x = s, y = 1 takes the form

\[ u = \frac{x^2y^2}{xy - x + y} \]


9.3 Solution of the wave equation


9.3.1 


In this section we consider methods of solving the wave equation introduced in 
Section 9.2.1. 


D'Alembert solution and characteristics 
A classical solution of the one-dimensional wave equation

\[ \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} \qquad (9.4) \]

is obtained by changing the axes to reduce the equation to a particularly simple form.
Let

\[ r = x + ct, \qquad s = x - ct \]

Then, using the chain rule procedure for transformation of coordinates (see Section
3.1.1),

\[ u_{xx} = u_{rr} + 2u_{rs} + u_{ss} \]

\[ u_{tt} = c^2(u_{rr} - 2u_{rs} + u_{ss}) \]

so that the wave equation (9.4) becomes

\[ 4c^2 u_{rs} = 0 \]

This equation can now be integrated once with respect to s to give

\[ u_r = \frac{\partial u}{\partial r} = \theta(r) \]

where θ is an arbitrary function of r. Now, integrating with respect to r, we obtain

\[ u = f(r) + g(s) \]

which, on substituting for r and s, gives the solution of the wave equation (9.4) as

\[ u = f(x + ct) + g(x - ct) \qquad (9.17) \]

where f and g are arbitrary functions and f is just the integral of the arbitrary function θ.

The solution (9.17) is one of the few cases where the general solution of a partial
differential equation can be found. However, finding the precise form of the arbitrary
functions f and g that satisfy given initial data is not always easy. The initial conditions
must give just enough information to evaluate f and g, which are functions of the single
variables r = x + ct and s = x − ct respectively.

In Example 9.2 we have already seen a simple example of a wave of this type. We
first deduced that a function of x − ct satisfied the wave equation, and then showed in
Figure 9.2 that it represented a wave travelling in the x direction with velocity c.

The next example is similar.


Example 9.10 Check that u = 1/[1 + (x + ct)²] satisfies the wave equation (9.4) and show that it
represents a travelling wave in the −x direction.


Solution Differentiating partially with respect to x and t

\[ u_x = \frac{-2(x + ct)}{[1 + (x + ct)^2]^2}, \qquad u_{xx} = \frac{2[-1 + 3(x + ct)^2]}{[1 + (x + ct)^2]^3} \]

\[ u_t = \frac{-2c(x + ct)}{[1 + (x + ct)^2]^2}, \qquad u_{tt} = \frac{2c^2[-1 + 3(x + ct)^2]}{[1 + (x + ct)^2]^3} \]

and the wave equation is satisfied. Plots of the function u against x for various values
of ct are shown in Figure 9.8. The same curve can be seen to be just translated to the left.


Figure 9.8 Solution to Example 9.10 showing u against x for various values of ct.


In Example 9.11 we attempt the more difficult task of fitting initial conditions to the
solution.


Example 9.11 Solve the wave equation (9.4) subject to the conditions
(a) zero initial velocity, ∂u(x, 0)/∂t = 0 for all x, and
(b) an initial displacement given by

\[ u(x, 0) = F(x) = \begin{cases} 1 - x & (0 \le x \le 1) \\ 1 + x & (-1 \le x \le 0) \\ 0 & \text{otherwise} \end{cases} \]


Figure 9.9 Initial displacement in Example 9.11.

Figure 9.10 Solution to Example 9.11 showing two waves propagating in the +x and −x directions
with velocity c.


Solution

This example corresponds physically to an infinite string initially at rest, and displaced 
as in Figure 9.9, which is then released. 
From (9.17) we have a solution of the wave equation as

\[ u = f(x + ct) + g(x - ct) \]

We now fit the given boundary data. Condition (a) gives

\[ 0 = cf'(x) - cg'(x) \quad \text{for all } x \]

so that

\[ f(x) - g(x) = K = \text{an arbitrary constant} \]

and thus

\[ u = f(x + ct) + f(x - ct) - K \]

Similarly, condition (b) gives

\[ F(x) = 2f(x) - K \]

so that

\[ u = \tfrac{1}{2}F(x + ct) + \tfrac{1}{2}F(x - ct) \qquad (9.18) \]

We now have the solution to the equation in terms of the function F defined in condition
(b). (Note that the same is true for any function F.)

The solution is plotted in Figure 9.10 as wu against x for given times. It may be 
observed from this example that we have two travelling waves, one propagating to the 
right and one to the left. The initial shape is propagated exactly, except for a factor of 
two, and the shape discontinuities are not smoothed out, as noted in Section 9.2.1. 





The analysis in Example 9.11 can be extended to solve the wave equation subject to
the general conditions

(a) an initial velocity, ∂u(x, 0)/∂t = G(x), and
(b) an initial displacement, u(x, 0) = F(x) for all x.

Condition (a) gives, from (9.17),

\[ G(x) = cf'(x) - cg'(x) \]

so that

\[ c[f(x) - g(x)] = \int_0^x G(x)\,dx + Kc \]

Condition (b) gives

\[ f(x) + g(x) = F(x) \]

and we can solve for f(x) and g(x) as

\[ f(x) = \tfrac{1}{2}F(x) + \frac{1}{2c}\int_0^x G(x)\,dx + \tfrac{1}{2}K \]

\[ g(x) = \tfrac{1}{2}F(x) - \frac{1}{2c}\int_0^x G(x)\,dx - \tfrac{1}{2}K \]

The solution thus becomes

\[ u = \tfrac{1}{2}[F(x + ct) + F(x - ct)] + \frac{1}{2c}\int_{x-ct}^{x+ct} G(z)\,dz \qquad (9.19) \]

which is commonly called the d’Alembert solution. As in Examples 9.2, 9.10 and
9.11, it gives rise to waves propagating in the +x and −x directions.

Figure 9.11 Solution to Example 9.11 in (x, t, u) space.
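
Formula (9.19) is straightforward to evaluate numerically for given F and G. The sketch below is one possible implementation, not a prescribed one: the integral of G is approximated by Simpson's rule, and the function names and the number of strips are choices made here for illustration. It is tried on the triangular displacement of Example 9.11 with zero initial velocity, and on a case with zero displacement and G(x) = cos x, for which (9.19) can be integrated by hand to give u = (1/c) cos x sin ct.

```python
import math

def dalembert(F, G, c, x, t, n=200):
    """Evaluate (9.19); the integral of G uses Simpson's rule with n (even) strips."""
    lo, hi = x - c * t, x + c * t
    step = (hi - lo) / n
    total = G(lo) + G(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * G(lo + i * step)
    integral = total * step / 3.0
    return 0.5 * (F(x + c * t) + F(x - c * t)) + integral / (2.0 * c)

def triangle(x):
    """Initial displacement F(x) of Example 9.11."""
    return max(0.0, 1.0 - abs(x))

zero = lambda x: 0.0

# Example 9.11: zero initial velocity, so u is the mean of two shifted copies of F.
print(dalembert(triangle, zero, 1.0, 0.0, 0.5))

# Zero displacement with G = cos: compare with the hand-integrated result.
c, x, t = 2.0, 0.3, 0.4
print(dalembert(zero, math.cos, c, x, t), math.cos(x) * math.sin(c * t) / c)
```

With G = 0 the routine reduces to (9.18), so the first value is simply half the sum of F(0.5) and F(−0.5).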

As mentioned in Section 9.1, a difficulty is to illustrate the solution of a partial differential
equation in a simple way. Figure 9.10 is a ‘snapshot’ at a particular time t, and if we wish
to look at the solution over all (x, t) then we have to draw u as a function of the two variables
x and t. We can draw the solution to Example 9.11 in a three-dimensional diagram as in
Figure 9.11, but for any higher-dimensional problem such a diagram is clearly impossible.
The ‘snapshot’ in Figure 9.10 corresponds to a plane slice parallel to the (u, x) plane.

The d'Alembert solution, which reduces to an integral along the boundary, does not 
have any simple extension other than for the x axis. In Section 9.7.2 the Green’s function 
is introduced and it can be interpreted as an extension since it involves integrals round 
the boundary of a general region. However, the calculation of the Green’s function is a 
tough proposition for any but the simplest regions. Following from the idea of the 
d’Alembert solution, characteristics (which will be studied in the next few paragraphs) 
can be used to extend the range of boundaries that can be dealt with. 

The idea of using an (x, t) plane is a very useful one for the wave equation, since the 
solution 


и = f(x + ct) + g(x — cf) 


gives a representation by characteristics. If we plot the lines x + ct = constant and 
х = сі = constant as in Figure 9.12 then we see that the line AP has equation x — ct = xy 
and the line BP has equation x + ct = x,. Thus 


on the whole of AP g(x — ct) = g(Xp) 
on the whole of BP f(x + ct) = f(x) 





746 PARTIAL DIFFERENTIAL EQUATIONS 


Figure 9.12 
Characteristics 

x + ct = constant and 
x — ct = constant. 





Thus g takes a constant value on AP and f takes a constant value on BP. If we can
calculate f and g on the initial line t = 0 then we know the value of u at P, namely

$$u(P) = f(x_B) + g(x_A) \qquad (9.20)$$

Since P is an arbitrary point, the solution at any point would be known. The essential
problem is to calculate f(x) and g(x) on the line t = 0.
Typical conditions on t = 0 are

(a) $u(x, 0) = F(x)$, and
(b) $\partial u(x, 0)/\partial t = G(x)$,


which specify the initial position and velocity of the system. Now 


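$$c\frac{\partial u}{\partial x} + \frac{\partial u}{\partial t} = cf'(x + ct) + cg'(x - ct) + cf'(x + ct) - cg'(x - ct) = 2cf'(x + ct)$$

and similarly

$$c\frac{\partial u}{\partial x} - \frac{\partial u}{\partial t} = 2cg'(x - ct)$$

On t = 0 we know that $\partial u/\partial x = F'(x)$ and $\partial u/\partial t = G(x)$, so we can deduce that

$$cF'(x) + G(x) = 2cf'(x)$$
$$cF'(x) - G(x) = 2cg'(x)$$

Since F and G are given, we can compute

$$f'(x) = \tfrac{1}{2}[F'(x) + G(x)/c]$$
$$g'(x) = \tfrac{1}{2}[F'(x) - G(x)/c]$$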


and hence f(x) and g(x) can be computed by straightforward integration. 

This method is essentially the same as the d'Alembert method, but it concentrates
on calculating f(x) and g(x) on the initial line and then constructing the solution at P by 
the characteristics AP and BP. The method gives great insight into the behaviour of the 
solution of such equations, but it is not an easy technique to use in practice. Perhaps the 
best that can be obtained from characteristics is an idea of how the solution depends on 
the initial data. In Figure 9.13 the characteristics emanating from the initial line are 
drawn. To evaluate the solution at P, we must have information on the section of the 
initial line AB, and the rest of the initial line is irrelevant to the solution at P. This is 
called the domain of dependence. The section of the initial line AB has a domain of 
influence determined by the characteristics through the points A and B. The data on AB 
cannot influence the solution outside the shaded region in Figure 9.13(b).

Figure 9.13 Characteristics showing (a) the domain of dependence and (b) the domain of influence.

Example 9.12 Use characteristics to compute the solution of the one-dimensional wave equation (9.4),
with speed c = 1, given (a) the initial conditions that $u = V(x)$ and $\partial u/\partial t = 0$, for $x \ge 0$
and t = 0, and (b) the boundary condition that u = 0 at x = 0 and t > 0. Describe the
solutions in the particular cases
(i) $V(x) = 1$ and (ii) $V(x) = \dfrac{x}{x^2 + 1}$





Solution The characteristics are plotted in Figure 9.14. It can be seen that for x > t, at a typical
point P two characteristics emanating from the initial line, t = 0, meet and the solution
can be computed at P from data on the initial line. However, for x < t, at a typical point
Q the characteristics emanating from the boundary, x = 0, are required. These observations
will be borne out in the mathematical computations.

Figure 9.14 Characteristics for Example 9.12.

For the region x > t the characteristic analysis described above can
be followed. It was shown that f and g in the solution

$$u = f(x - t) + g(x + t) \qquad (9.21)$$

can be calculated, taking G(x) = 0, as

$$f'(x) = \tfrac{1}{2}V'(x) \quad\text{and}\quad g'(x) = \tfrac{1}{2}V'(x)$$

Integrating, and putting the arbitrary constant to be zero, gives

$$f(x) = \tfrac{1}{2}V(x) \quad\text{and}\quad g(x) = \tfrac{1}{2}V(x) \quad\text{for } x \ge 0$$

and hence the solution

$$u(x, t) = \tfrac{1}{2}[V(x + t) + V(x - t)] \quad\text{for } x > t \qquad (9.22)$$


For the region x < t, (9.21) requires f(z) at negative values of z. The function is not yet
known for negative values and must be determined by the other condition (b) on x = 0.
The condition u = 0 at x = 0 and t > 0 implies in (9.21)

$$0 = f(-t) + g(t)$$

and hence for a general variable z

$$f(-z) = -g(z) = -\tfrac{1}{2}V(z) \quad\text{for } z > 0$$

Using this result, (9.21) gives the required solution

$$u(x, t) = \tfrac{1}{2}[V(x + t) - V(t - x)] \quad\text{for } x < t \qquad (9.23)$$

The complete solution for all x > 0 and t > 0 is now known from (9.22) and (9.23).


Case (i) 


In this case V(x) = 1 so (9.22) gives u = 1, in the shaded region of Figure 9.14, and
(9.23) gives u = 0, in the unshaded region of Figure 9.14. Thus

$$u(x, t) = \begin{cases} 1 & \text{for } x > t \\ 0 & \text{for } x < t \end{cases}$$

Note that the discontinuity in the boundary data at x = 0, t = 0 is propagated along the
characteristic x = t.
Case (ii) 


Putting the function $V(x) = x/(x^2 + 1)$ into (9.22) and (9.23) gives the solution

$$u(x, t) = \begin{cases} \dfrac{1}{2}\left[\dfrac{x - t}{(x - t)^2 + 1} + \dfrac{x + t}{(x + t)^2 + 1}\right] & \text{for } x > t \\[2ex] \dfrac{1}{2}\left[-\dfrac{t - x}{(t - x)^2 + 1} + \dfrac{x + t}{(x + t)^2 + 1}\right] & \text{for } x < t \end{cases}$$

The boundary data are now smooth so the function u(x, t) remains smooth as illustrated
for three cases in Figure 9.15.

The basic physical problem described in this example is a very long string held at
one end and initially at rest. The string is then displaced at t = 0 in the shape of the
function V(x) and released. 
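
The two-branch solution (9.22), (9.23) of Example 9.12 can be sketched in a few lines of Python. This is only an illustration under the example's assumptions (c = 1, x ≥ 0, t ≥ 0); the function names `u`, `V_one` and `V_smooth` are introduced here and are not part of the original text.

```python
def u(V, x, t):
    """Solution of Example 9.12 with c = 1:
    (9.22) on x >= t (data from the initial line only),
    (9.23) on x < t  (the boundary x = 0 is involved)."""
    if x >= t:
        return 0.5 * (V(x + t) + V(x - t))
    return 0.5 * (V(x + t) - V(t - x))

def V_one(x):          # Case (i)
    return 1.0

def V_smooth(x):       # Case (ii)
    return x / (x * x + 1.0)

print(u(V_one, 2.0, 1.0), u(V_one, 0.5, 1.0))   # either side of x = t
print(u(V_smooth, 0.0, 1.3))                    # on the boundary x = 0
```

For Case (ii) the function V is odd, so V(x - t) = -V(t - x) and the two branches coincide; this is consistent with the smooth behaviour noted in the example, whereas in Case (i) the branches differ and a jump travels along x = t.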


Figure 9.15 Smooth solutions for Example 9.12, Case (ii); string displacements at various times.

In more complicated problems, the evaluation of the arbitrary functions f and g in
equation (9.17) and the use of characteristics is no longer straightforward. We do not
have a d'Alembert type of solution; a great deal of thought and care is needed.

The idea of characteristics can be applied to more general classes of second-order
partial differential equations. Example 9.13 illustrates a case of a constant-coefficient
equation.

Example 9.13 Find the characteristics of the equation

$$0 = u_{xx} + 2u_{xt} + 2\alpha u_{tt}$$

Study the case when $\alpha = \tfrac{3}{8}$ and the solution satisfies the boundary conditions

(a) $\partial u(x, 0)/\partial t = 0$ for $x \ge 0$
(b) $u(x, 0) = F(x) = \begin{cases} 1 & (0 \le x < 1) \\ 0 & (x \ge 1) \end{cases}$
(c) $u(0, t) = 0$ for all t
(d) $\partial u(0, t)/\partial x = 0$ for all t

Solution Since the coefficients of the equation are constants, we know that the characteristics are
straight lines, so we look for solutions of the form

$$u = u(x + at)$$

Putting z = x + at and writing $u' = du/dz$ and so on, we obtain

$$0 = u_{xx} + 2u_{xt} + 2\alpha u_{tt} = (1 + 2a + 2a^2\alpha)u''$$

Hence for a solution we require

$$1 + 2a + 2a^2\alpha = 0$$

or

$$a = \frac{-1 \pm \sqrt{1 - 2\alpha}}{2\alpha}$$

Figure 9.16 Characteristic solution of Example 9.13. The solution u takes the constant values
shown in the six regions of the first quadrant.

If $\alpha > \tfrac{1}{2}$ then the two values of a ($a_1$ and $a_2$, say) are complex, and the characteristics
$x + a_1 t$ = constant and $x + a_2 t$ = constant do not make sense in the real plane.

If $\alpha = \tfrac{1}{2}$ then both roots give a = -1, and we only have a single characteristic x - t =
constant, which is not useful for further computation.

For the case $\alpha < \tfrac{1}{2}$, we find two real values for a and two sets of characteristics.

It is precisely for this reason that characteristics serve no useful purpose for the heat-conduction
or Laplace equations. A further discussion can be found in Section 9.8 after
the formal classification of equations has been completed.

Take the case $\alpha = \tfrac{3}{8}$; then we obtain $a_1 = -2$ and $a_2 = -\tfrac{2}{3}$, so the solution has the form

$$u = f(x - 2t) + g(x - \tfrac{2}{3}t)$$

where f and g are arbitrary functions and the characteristics are the straight lines x - 2t
= constant and $x - \tfrac{2}{3}t$ = constant.

The boundary conditions given in the problem are a little more complicated than in
the d'Alembert solution. Conditions (a) and (b) give

$$0 = \frac{\partial u(x, 0)}{\partial t} = -2f'(x) - \tfrac{2}{3}g'(x)$$
$$F(x) = u(x, 0) = f(x) + g(x)$$

Taking f(0) = g(0) = 0, we can integrate the first of these expressions and then solve for
f(x) and g(x) on the line t = 0 as

$$f(x) = -\tfrac{1}{2}F(x), \quad g(x) = \tfrac{3}{2}F(x) \quad (x \ge 0)$$

Conditions (c) and (d) say that $u(0, t) = 0$ and $\partial u(0, t)/\partial x = 0$. Thus on the line x = 0
we deduce

$$f(z) = g(z) = 0 \quad (z < 0)$$

We can now construct the solution by characteristics. Figure 9.16 illustrates this
solution. Because f and g are constant along the respective characteristics, we
deduce u(A) = 0, u(B) = $-\tfrac{1}{2}$, u(C) = 1, u(D) = 0, u(E) = $\tfrac{3}{2}$, u(F) = 0 at typical points in
the six regions that divide up the first quadrant of the (x, t) plane.
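
The construction in Example 9.13 can be checked with a short Python sketch. It assumes the case α = 3/8 and the step function F of condition (b); the names `characteristic_slopes`, `F` and `u` are introduced here for illustration only.

```python
import math

def characteristic_slopes(alpha):
    """Real roots a of 1 + 2a + 2*alpha*a**2 = 0 (needs 0 < alpha <= 1/2)."""
    disc = 1.0 - 2.0 * alpha
    if disc < 0:
        raise ValueError("complex roots: no real characteristics")
    r = math.sqrt(disc)
    return ((-1.0 - r) / (2.0 * alpha), (-1.0 + r) / (2.0 * alpha))

def F(x):
    """Initial displacement of condition (b); zero for negative arguments,
    which also encodes f(z) = g(z) = 0 for z < 0."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def u(x, t):
    """u = f(x - 2t) + g(x - 2t/3) with f = -F/2 and g = 3F/2."""
    return -0.5 * F(x - 2.0 * t) + 1.5 * F(x - 2.0 * t / 3.0)

print(characteristic_slopes(3.0 / 8.0))
# One sample point in each of the six regions of the first quadrant
for point in [(0.5, 1.0), (1.0, 1.0), (2.5, 2.0), (0.9, 0.2), (1.5, 0.3), (3.0, 0.5)]:
    print(point, u(*point))
```

The sample points give the constant values 0, 3/2, 0, 1, -1/2 and 0, the same set of values quoted for the six regions.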


[Figure 9.16: Time plotted against Distance; the characteristics drawn are 3x = 2t, 3x = 2t + 3, x = 2t and x = 2t + 1.]





For non-constant-coefficient equations the characteristics are not usually straight lines,
which causes computational difficulties. In particular, there are some fundamental problems
when characteristics of the same family intersect. The solution loses its uniqueness,
and ‘shocks’ can be generated. The classical wave equation (9.4) will propagate these
shocks, but it requires ‘curved characteristics’ to generate them.


9.3.2 Separated solutions


A method of considerable importance is the method of separation of variables. The
basis of the method is to attempt to look for solutions u(x, y) of a partial differential
equation as a product of functions of single variables

$$u(x, y) = X(x)Y(y)$$

The advantage of this approach is that it is sometimes possible to find X and Y as solutions
of ordinary differential equations. These are very much easier to solve than partial
differential equations, and it may be possible to build up solutions of the full equation
in terms of the solutions for X and Y. A simple example illustrates the general strategy.
Suppose that we wish to solve

$$\frac{\partial u}{\partial x} + \frac{\partial u}{\partial y} = 0$$

Then we should write u = X(x)Y(y) and substitute

$$Y\frac{dX}{dx} + X\frac{dY}{dy} = 0, \quad\text{or}\quad \frac{1}{X}\frac{dX}{dx} = -\frac{1}{Y}\frac{dY}{dy}$$

Note that the partial differentials become ordinary differentials, since the functions are
just functions of a single variable. Now

$$\text{LHS} = \frac{1}{X}\frac{dX}{dx} = \text{a function of } x \text{ only}$$
$$\text{RHS} = -\frac{1}{Y}\frac{dY}{dy} = \text{a function of } y \text{ only}$$

Since LHS = RHS for all x and y, the only way that this can be achieved is for each side
to be a constant. We thus have two ordinary differential equations

$$\frac{1}{X}\frac{dX}{dx} = \lambda = -\frac{1}{Y}\frac{dY}{dy}$$

These equations can be solved easily as

$$X = Be^{\lambda x}, \quad Y = Ce^{-\lambda y}$$

and thus the solution of the original partial differential equation is

$$u(x, y) = X(x)Y(y) = Ae^{\lambda(x - y)}$$

where A = BC. The constants A and λ are arbitrary. The crucial question is whether the
boundary conditions imposed by the problem can be satisfied by a sum of solutions of
this type.

The method of separation of variables can be a very powerful technique, and we
shall see it used on all three of the basic partial differential equations. It should be
noted, however, that not all equations have separated solutions (see Example 9.2),
and even when they can be obtained it is not always possible to satisfy the boundary
conditions with such solutions.

In the case of the heat-conduction equation and the wave equation, the form of one
of the functions in the separated solution is dictated by the physics of the problem. We
shall see that the separation technique becomes a little simpler when such physical
arguments are used. However, for the Laplace equation there is no help from the
physics, so the method just described needs to be applied.

In most wave equation problems we are looking for either a travelling-wave solution
as in Section 9.3.1 or for periodic solutions, as a result of plucking a violin string for
instance. It therefore seems natural to look for specific solutions that have periodicity
built into them. These will not be general solutions, but they will be seen to be useful
for a whole class of problems. The essential mathematical simplicity of the method
comes from only having to solve ordinary differential equations.

The above argument suggests that we seek solutions of the wave equation

$$\frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} \qquad (9.4)$$

of the form either

$$u = \sin(c\lambda t)\,v(x) \qquad (9.24a)$$

or

$$u = \cos(c\lambda t)\,v(x) \qquad (9.24b)$$

both of which when substituted into (9.4) give the ordinary differential equation

$$\frac{d^2 v}{dx^2} = -\lambda^2 v$$

This is a simple harmonic equation with solutions v = sin λx or v = cos λx. We can thus
build up a general solution of (9.4) from linear multiples of the four basic solutions

$$u_1 = \cos\lambda ct\,\sin\lambda x \qquad (9.25a)$$
$$u_2 = \cos\lambda ct\,\cos\lambda x \qquad (9.25b)$$
$$u_3 = \sin\lambda ct\,\sin\lambda x \qquad (9.25c)$$
$$u_4 = \sin\lambda ct\,\cos\lambda x \qquad (9.25d)$$

and try to satisfy the boundary conditions using appropriate linear combinations of
solutions of this type. We saw an example of such a solution in Example 9.1.

Example 9.14 Solve the wave equation (9.4) for the vibration of a string stretched between the points
x = 0 and x = l and subject to the boundary conditions

(a) $u(0, t) = 0$ $(t \ge 0)$ (fixed at the end x = 0);
(b) $u(l, t) = 0$ $(t \ge 0)$ (fixed at the end x = l);
(c) $\partial u(x, 0)/\partial t = 0$ $(0 \le x \le l)$ (with zero initial velocity);
(d) $u(x, 0) = F(x)$ (given initial displacement).

Consider the two cases

(i) $F(x) = \sin(\pi x/l) + \tfrac{1}{4}\sin(3\pi x/l)$

(ii) $F(x) = \begin{cases} x & (0 \le x \le \tfrac{1}{2}l) \\ l - x & (\tfrac{1}{2}l \le x \le l) \end{cases}$

Solution Clearly, we are solving the problem of a stretched string, held at its ends x = 0 and x = l
and released from rest.

By inspection, we see that the solutions (9.25b, d) cannot satisfy condition (a). We
see that condition (b) is satisfied by the solutions (9.25a, c), provided that

$$\sin\lambda l = 0, \quad\text{or}\quad \lambda l = n\pi \quad (n = 1, 2, 3, \ldots)$$

It may be noted that only specific values of λ in (9.25) give permissible solutions. Thus
the string can only vibrate with given frequencies, nc/2l. The solution (9.25) appropriate
to this problem takes the form either

$$u = \cos\left(\frac{n\pi ct}{l}\right)\sin\left(\frac{n\pi x}{l}\right) \qquad (9.26a)$$

or

$$u = \sin\left(\frac{n\pi ct}{l}\right)\sin\left(\frac{n\pi x}{l}\right) \qquad (9.26b)$$

(n = 1, 2, 3, ...). To satisfy condition (c) for all x, we must choose the solution (9.26a)
and omit (9.26b). Clearly, it is not possible to satisfy the initial condition (d) with a
single solution of the form (9.26a). However, because the wave equation is linear, any sum of such solutions is also
a solution. Thus we build up a solution

$$u = \sum_{n=1}^{\infty} b_n \cos\left(\frac{n\pi ct}{l}\right)\sin\left(\frac{n\pi x}{l}\right) \qquad (9.27)$$


Case (i)

The initial condition (d) for u(x, 0) gives

$$\sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi x}{l}\right) = \sin\left(\frac{\pi x}{l}\right) + \tfrac{1}{4}\sin\left(\frac{3\pi x}{l}\right)$$

and the values of $b_n$ can be evaluated by inspection as

$$b_1 = 1, \quad b_2 = 0, \quad b_3 = \tfrac{1}{4}, \quad b_4 = b_5 = \ldots = 0$$

The full solution is therefore

$$u = \cos\left(\frac{\pi ct}{l}\right)\sin\left(\frac{\pi x}{l}\right) + \tfrac{1}{4}\cos\left(\frac{3\pi ct}{l}\right)\sin\left(\frac{3\pi x}{l}\right)$$

The solution is illustrated in Figure 9.17.

Figure 9.17 Sketch of the solution to Example 9.14 (i) with c = 1 and l = 1.











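Case (ii)

The condition (d) for u(x, 0) simply gives

$$\sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi x}{l}\right) = \begin{cases} x & (0 \le x \le \tfrac{1}{2}l) \\ l - x & (\tfrac{1}{2}l \le x \le l) \end{cases}$$

and thus to determine $b_n$ we must find the Fourier sine series expansion of the function
F(x) over the finite interval $0 \le x \le l$. We have from (7.33) that

$$b_n = \frac{2}{l}\int_0^l F(x)\sin\left(\frac{n\pi x}{l}\right)dx = \frac{2}{l}\int_0^{l/2} x\sin\left(\frac{n\pi x}{l}\right)dx + \frac{2}{l}\int_{l/2}^{l} (l - x)\sin\left(\frac{n\pi x}{l}\right)dx = \frac{4l}{\pi^2 n^2}\sin(\tfrac{1}{2}n\pi) \quad (n = 1, 2, 3, \ldots)$$

The complete solution of the wave equation in this case is therefore

$$u(x, t) = \frac{4l}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\sin(\tfrac{1}{2}n\pi)\cos\left(\frac{n\pi ct}{l}\right)\sin\left(\frac{n\pi x}{l}\right) \qquad (9.28)$$

or

$$u(x, t) = \frac{4l}{\pi^2}\left[\cos\left(\frac{\pi ct}{l}\right)\sin\left(\frac{\pi x}{l}\right) - \frac{1}{3^2}\cos\left(\frac{3\pi ct}{l}\right)\sin\left(\frac{3\pi x}{l}\right) + \frac{1}{5^2}\cos\left(\frac{5\pi ct}{l}\right)\sin\left(\frac{5\pi x}{l}\right) - \ldots\right]$$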


The complete solution to Example 9.14 Case (ii) gives some very useful information.
We see that all the even ‘harmonics’ have disappeared from the solution and the amplitudes
of the harmonics decrease like $1/n^2$. A beautiful theory of musical instruments
can be built up from such solutions. We see that for different instruments different





or 


Figure 9.18 Solution 
of Example 9.14 (ii) 
with c= + and/=1. 


Example 9.15 


Solution

[Figure 9.18 plot: Displacement against Distance, with curves at t = 0, 0.2, 0.4, ...]

harmonics are important and have different amplitudes. It is this that gives an instru- 
ment its characteristic sound. 

A sensible question that we can ask is whether we can use the sum of the series in 
(9.28) to plot u. In Chapter 7 we saw that convergence is not simple to prove, but the 
Dirichlet conditions in Theorem 7.2 are sufficient to ensure pointwise convergence. 
At discontinuities in particular, Fourier series can be very slow to converge, so that, 
although (9.28) is a complete solution, does it provide us with any useful information? 
In the present case there is no particular problem, but the general comment should be 
noted. The solution (9.28) is plotted in Figure 9.18 with / 2 1 and c — ; . We note that 
even with 10 terms u(0.5, 0) = 0.4899 instead of the correct value 0.5, so there is a 2%
error in the calculated value. Perhaps the most pertinent comment that we can make is 
that a good number of terms in the series are required to obtain a solution, and exact 
solutions may not be as useful as we might expect. 
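
The slow convergence noted above can be reproduced with a few lines of Python. This sketch sums the series (9.28) directly; the function name `u_series` and its default arguments are assumptions made here, and "10 terms" is interpreted as the first ten non-zero (odd-n) harmonics, which is the reading that reproduces the quoted value.

```python
import math

def u_series(x, t, terms, l=1.0, c=1.0):
    """Partial sum of (9.28) using the first `terms` non-zero (odd n) harmonics."""
    total = 0.0
    for k in range(terms):
        n = 2 * k + 1
        total += (math.sin(0.5 * n * math.pi) / n ** 2
                  * math.cos(n * math.pi * c * t / l)
                  * math.sin(n * math.pi * x / l))
    return 4.0 * l / math.pi ** 2 * total

print(round(u_series(0.5, 0.0, 10), 4))    # 0.4899, about 2% below 0.5
print(round(u_series(0.5, 0.0, 2000), 4))  # many more terms are needed to approach 0.5
```

The coefficients decay only like 1/n², so the error at the corner of the initial profile falls off roughly in proportion to the reciprocal of the number of terms retained.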

Separated solutions depend on judicious use of the known solutions (9.25) of the 
wave equation to fit the boundary conditions. Although it is not always possible to solve 
any particular problem using separated solutions, the idea is sufficiently straightforward 
that it is always worth a try. The extension to other equations and coordinate systems 
is possible. The use of other orthogonal functions was introduced in Section 7.7 and 
some of these will be discussed in Sections 9.4 and 9.5. 


Solve the wave equation (9.4) for vibrations in an organ pipe subject to the boundary
conditions

(a) $u(0, t) = 0$ $(t \ge 0)$ (the end x = 0 is closed);
(b) $\partial u(l, t)/\partial x = 0$ $(t \ge 0)$ (the end x = l is open);
(c) $u(x, 0) = 0$ $(0 \le x \le l)$ (the pipe is initially undisturbed);
(d) $\partial u(x, 0)/\partial t = v$ = constant $(0 \le x \le l)$ (the pipe is given an initial uniform blow).


From the solution (9.25), we deduce from condition (a) that solutions (9.25b, d) must be 
omitted, and similarly from condition (c) that solution (9.25a) is not useful. We are left with 
the solution (9.25c) to satisfy the boundary condition (b). This can only be satisfied if 


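$$\cos\lambda l = 0, \quad\text{or}\quad \lambda l = (n + \tfrac{1}{2})\pi \quad (n = 0, 1, 2, \ldots)$$

Thus we obtain solutions of the form

$$u = b_n \sin\left[(n + \tfrac{1}{2})\frac{\pi ct}{l}\right]\sin\left[(n + \tfrac{1}{2})\frac{\pi x}{l}\right] \quad (n = 0, 1, 2, \ldots)$$

giving a general solution

$$u = \sum_{n=0}^{\infty} b_n \sin\left[(n + \tfrac{1}{2})\frac{\pi ct}{l}\right]\sin\left[(n + \tfrac{1}{2})\frac{\pi x}{l}\right]$$

The condition (d) gives

$$v = \sum_{n=0}^{\infty} b_n (n + \tfrac{1}{2})\frac{\pi c}{l}\sin\left[(n + \tfrac{1}{2})\frac{\pi x}{l}\right]$$

which, on using (7.33) to obtain the coefficients of the Fourier sine series expansion of
the constant v over the finite interval $0 \le x \le l$, gives

$$b_n = \frac{2v}{(n + \tfrac{1}{2})\pi}\cdot\frac{l}{(n + \tfrac{1}{2})\pi c} = \frac{8lv}{\pi^2 c}\,\frac{1}{(2n + 1)^2}$$

Our complete solution of the wave equation is therefore

$$u = \frac{8lv}{\pi^2 c}\sum_{n=0}^{\infty}\frac{1}{(2n + 1)^2}\sin\left[(n + \tfrac{1}{2})\frac{\pi ct}{l}\right]\sin\left[(n + \tfrac{1}{2})\frac{\pi x}{l}\right]$$

or,

$$u = \frac{8lv}{\pi^2 c}\left[\sin\left(\frac{\pi ct}{2l}\right)\sin\left(\frac{\pi x}{2l}\right) + \frac{1}{3^2}\sin\left(\frac{3\pi ct}{2l}\right)\sin\left(\frac{3\pi x}{2l}\right) + \frac{1}{5^2}\sin\left(\frac{5\pi ct}{2l}\right)\sin\left(\frac{5\pi x}{2l}\right) + \ldots\right]$$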

It would be instructive to compute this solution and compare it with Figure 9.18, which 
corresponds to the solution of Example 9.14. 


9.3.3 Laplace transform solution 


For linear problems that are time varying from 0 to ∞, as in the case of the wave
equation, Laplace transforms provide a formal method of solution. The only difficulty
is whether the final inversion can be performed.

First we obtain the Laplace transforms of the partial derivatives

$$\frac{\partial u}{\partial x}, \quad \frac{\partial u}{\partial t}, \quad \frac{\partial^2 u}{\partial x^2}, \quad \frac{\partial^2 u}{\partial t^2}$$

of the function u(x, t), $t \ge 0$. Using the same procedure as that used to obtain the
Laplace transform of standard derivatives in Section 5.3.1, we have the following:

(a)
$$\mathcal{L}\left\{\frac{\partial u}{\partial x}\right\} = \int_0^{\infty} e^{-st}\frac{\partial u}{\partial x}\,dt = \frac{d}{dx}\int_0^{\infty} e^{-st}u(x, t)\,dt$$

using Leibniz's rule (see Modern Engineering Mathematics) for differentiation
under an integral sign. Noting that

$$\mathcal{L}\{u(x, t)\} = U(x, s) = \int_0^{\infty} e^{-st}u(x, t)\,dt$$

we have

$$\mathcal{L}\left\{\frac{\partial u}{\partial x}\right\} = \frac{d}{dx}U(x, s) \qquad (9.29)$$

(b) Writing $y(x, t) = \partial u/\partial x$, repeated application of the result (9.29) gives

$$\mathcal{L}\left\{\frac{\partial y}{\partial x}\right\} = \frac{d}{dx}\mathcal{L}\{y(x, t)\} = \frac{d}{dx}\left(\frac{d}{dx}U(x, s)\right)$$

so that

$$\mathcal{L}\left\{\frac{\partial^2 u}{\partial x^2}\right\} = \frac{d^2 U(x, s)}{dx^2} \qquad (9.30)$$

(c)
$$\mathcal{L}\left\{\frac{\partial u}{\partial t}\right\} = \int_0^{\infty} e^{-st}\frac{\partial u}{\partial t}\,dt = [e^{-st}u(x, t)]_0^{\infty} + s\int_0^{\infty} e^{-st}u(x, t)\,dt = [0 - u(x, 0)] + sU(x, s)$$

so that

$$\mathcal{L}\left\{\frac{\partial u}{\partial t}\right\} = sU(x, s) - u(x, 0) \qquad (9.31)$$

where we have assumed that u(x, t) is of exponential order.

(d) Writing $v(x, t) = \partial u/\partial t$, repeated application of (9.31) gives

$$\mathcal{L}\left\{\frac{\partial v}{\partial t}\right\} = sV(x, s) - v(x, 0) = s[sU(x, s) - u(x, 0)] - v(x, 0)$$

so that

$$\mathcal{L}\left\{\frac{\partial^2 u}{\partial t^2}\right\} = s^2 U(x, s) - su(x, 0) - u_t(x, 0) \qquad (9.32)$$

where $u_t(x, 0)$ denotes the value of $\partial u/\partial t$ at t = 0.

Let us now return to consider the wave equation (9.4)

$$c^2\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 u}{\partial t^2}$$

subject to the boundary conditions $u(x, 0) = f(x)$ and $\partial u(x, 0)/\partial t = g(x)$. Taking Laplace
transforms on both sides of (9.4) and using the results (9.30) and (9.32) gives

$$c^2\frac{d^2 U(x, s)}{dx^2} = s^2 U(x, s) - g(x) - sf(x) \qquad (9.33)$$

The problem has thus been reduced to an ordinary differential equation in U(x, s) of a
straightforward type. It can be solved for given conditions at the ends of the x range,
and the solution can then be inverted to give u(x, t).

MAPLE or MATLAB can be used to assist with the transforms and the inverse
transforms but considerable experience is needed to convert results to a simple form.

Example 9.16 Solve the wave equation (9.4) for a semi-infinite string by Laplace transforms, given that

(a) $u(x, 0) = 0$ $(x \ge 0)$ (string initially undisturbed);
(b) $\partial u(x, 0)/\partial t = xe^{-x/a}$ $(x \ge 0)$ (string given an initial velocity);
(c) $u(0, t) = 0$ $(t \ge 0)$ (string held at x = 0);
(d) $u(x, t) \to 0$ as $x \to \infty$ for $t \ge 0$ (string held at infinity).


Solution Using conditions (a) and (b) and substituting for f(x) and g(x) in the result (9.33), the
transformed equation in this case is

$$c^2\frac{d^2}{dx^2}U(x, s) = s^2 U(x, s) - xe^{-x/a}$$

By seeking a particular integral of the form

$$U = \alpha x e^{-x/a} + \beta e^{-x/a}$$

we obtain a solution of the differential equation as

$$U(x, s) = Ae^{sx/c} + Be^{-sx/c} - e^{-x/a}\left[\frac{x}{c^2/a^2 - s^2} + \frac{2c^2/a}{(c^2/a^2 - s^2)^2}\right]$$

where A and B are arbitrary constants.

Transforming the given boundary conditions (c) and (d), we have U(0, s) = 0 and
$U(x, s) \to 0$ as $x \to \infty$, which can be used to determine A and B. From the second
condition A = 0, and the first condition then gives

$$B = \frac{2c^2/a}{(c^2/a^2 - s^2)^2}$$

so that the solution becomes

$$U(x, s) = \frac{2c^2/a}{(c^2/a^2 - s^2)^2}e^{-sx/c} - e^{-x/a}\left[\frac{x}{c^2/a^2 - s^2} + \frac{2c^2/a}{(c^2/a^2 - s^2)^2}\right]$$

Fortunately in this case these transforms can be inverted from tables of Laplace
transforms.

Using the second shift theorem (5.45) together with the Laplace transform pairs

$$\mathcal{L}\{\sinh\omega t\} = \frac{\omega}{s^2 - \omega^2}, \quad \mathcal{L}\{\cosh\omega t\} = \frac{s}{s^2 - \omega^2}$$

$$\mathcal{L}\left\{\frac{\omega t\cosh\omega t - \sinh\omega t}{2\omega^3}\right\} = \frac{1}{(s^2 - \omega^2)^2}$$

with ω = c/a, we obtain the solution as

$$u = \frac{a}{c}\left\{e^{-x/a}\left[(x + a)\sinh\left(\frac{ct}{a}\right) - ct\cosh\left(\frac{ct}{a}\right)\right] + \left[(ct - x)\cosh\left(\frac{ct - x}{a}\right) - a\sinh\left(\frac{ct - x}{a}\right)\right]H(ct - x)\right\}$$

where H(t) is the Heaviside step function defined in Section 5.5.1.


9.3.4 Exercises 


Solve the wave equation

$$\frac{\partial^2 u}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2}$$

subject to the initial conditions

(a) $u(x, 0) = \sin x$ (all x)
(b) $\dfrac{\partial u}{\partial t}(x, 0) = 0$ (all x)

Use both the d'Alembert solution and the
separation of variables method and show
that they both give the same result.


Find the separated solution of the wave equation 


o) dg 
that satisfies the initial conditions 


u(x, 0) = 0, 2 (x, 0) = sin x(1 + cos x) 


Show that 


X-ter 


1-4ext- (xo - £y! de (x-et 1-(x+tcty 


18 


o 


20 


Hence deduce that the function satisfies the wave 
equation. Check that this differential equation is 
satisfied using MAPLE. 


The spherically symmetric version of the wave
equation (9.4) takes the form

$$\frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial r^2} + \frac{2}{r}\frac{\partial u}{\partial r}$$

Show, by putting v = ru, that it has a solution

$$ru = f(ct - r) + g(ct + r)$$

Interpret the terms as spherical waves.


Using the trigonometric identity

$$\sin A\cos B = \tfrac{1}{2}\sin(A - B) + \tfrac{1}{2}\sin(A + B)$$

rewrite the solution (9.28) to Example 9.14 as a
progressive wave.


Solve the wave equation 
ди 12u 
ox co 


subject to the initial conditions 


(a) u(x, 0) 20 (all x) 
ди ыш 
(b) 3 (x, 0) 2xe (all x) 
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22 


23 


24 
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Find the solutions to the wave equation (9.4) 25 
subject to the boundary conditions 


(а) ди(х, 0)/91=0 forall x 


1-х (0=х%1) 
(b u-314x (-1«€x«0) att-20 
0 (|x| = 1) 
using d' Alembert's method. Compare with 26 


Example 9.11. 


Compute the characteristics of the equation 


Btk + би„+и„=0 


Show that the partial differential equation 


has solutions of the form u(x, t) = f(x + At) provided 
either A = 2 or A =-3. Given 


ди 


u(x,0)=x?-1 and àr (x, 0) = 2x for all x 


e 


find the solution for u. 


The function u(r, f) satisfies the partial differential 
equation 


u 2 ди 
d?) ror 


where c is a positive constant. Show that this 
equation has a solution of the form 


r 
и= gi) cos ot 
r 


where q is a constant and g satisfies 


2 2 
dg, 08 _9 
dr с 


Show that, if u satisfies the conditions 
u(a, t) = B cos ct 
u(b, t) 2 0 


then the solution is 


. Dacos ot sin[ ob - r)/c 
u(r,t)- 


r sin[ @(b - a)/c] 


Use characteristics to compute the solution of the 
wave equation (9.4), with speed c = 1, given the 
initial conditions that for all x and t= 0 


(a) u=0 (b) ди/дї = exp(-|x|) 


Use a time step of 0.5 to compute (on a spreadsheet 
or other package) the first four steps over the range 
-3«x«3. 


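Using the separated solution approach of
Section 9.3.2, obtain a series solution of the
wave equation

$$\frac{\partial^2 u}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2}$$

subject to the boundary conditions

(a) $u(0, t) = 0$ $(t \ge 0)$
(b) $\partial u(x, 0)/\partial t = 0$ $(0 \le x \le \pi)$
(c) $u(\pi, t) = 0$ $(t \ge 0)$
(d) $u(x, 0) = \pi x - x^2$ $(0 \le x \le \pi)$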

The end at x = 0 of an infinitely long string, initially
at rest along the x axis, undergoes a periodic
displacement a sin ωt, for t > 0, transverse to the
x axis. The displacement u(x, t) of any point on
the string at any time is given by the solution of
the wave equation

$$\frac{\partial^2 u}{\partial t^2} = c^2\frac{\partial^2 u}{\partial x^2} \quad (x > 0,\ t > 0)$$

subject to the boundary conditions

(a) $u(x, 0) = 0$ $(x > 0)$
(b) $\partial u(x, 0)/\partial t = 0$ $(x > 0)$
(c) $u(0, t) = a\sin\omega t$ $(t > 0)$
(d) $|u(x, t)| < L$, L constant

where the last condition specifies that the
displacement is bounded.

Using the Laplace transform method, show that
the displacement is given by

$$u(x, t) = a\sin\left[\omega\left(t - \frac{x}{c}\right)\right]H\left(t - \frac{x}{c}\right)$$

where H(t) is the Heaviside step function.
Plot a graph of u(x, t), and discuss.


9.3.5 Numerical solution

Figure 9.19 Mesh points for a numerical solution of the wave equation.


For all but the simplest problems, we have to find a numerical solution. In Section 9.3.1 
we saw that characteristics give a possible numerical way of working by extending the 
solution away from the initial line. While this method is possible, it is difficult to program 
except for the simplest problems, where other methods would be preferred anyway. 
In particular, when characteristics are curved it becomes difficult to keep track of the 
solution front. However, calculus methods suffer because they cannot cope with dis- 
continuities, so that, should these occur, the methods described in this section will tend 
to ‘smear out’ the shocks. Characteristics provide one of the few methods that will trap 
the shocks when we use the fact that the latter are propagated along the characteristics. 

The numerical solution of ordinary differential equations was studied in some detail in 
Chapter 2. The basis of the methods was to construct approximations to differentials in 
terms of values of the required function at discrete points. The commonest approximation 
was the ‘chord approximation’ 


$$\frac{df(a)}{dx} \approx \frac{f(a + h) - f(a - h)}{2h}$$

and for the second derivative

$$\frac{d^2 f(a)}{dx^2} \approx \frac{f(a + h) - 2f(a) + f(a - h)}{h^2}$$


The justification of these approximations and the computation of the errors involved depend 
on the Taylor expansions of the functions. In partial differentiation the approximations are 
the same except that there is a partial derivative in both x and t for the function u(x, t). 

Figure 9.19 illustrates a mesh of points, or nodes, with spacing Ax in the x direction 
and At in the t direction. Each node is specified by a pair of integers (i, j), so that 
the coordinates of the nodal points take the form 


$$x_i = a + i\Delta x, \quad t_j = b + j\Delta t$$


and a and b specify the origin chosen. The mesh points or nodes lie on the intersection 
of the rows ( j = constant) and columns (i = constant). 




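The approximations are applied to a typical point P, with discretized coordinates
(i, j), and with increments $\Delta x = x_{i+1} - x_i$ and $\Delta t = t_{j+1} - t_j$, which are taken to be uniform
through the mesh. We know that at the points A and B we can approximate

$$\left(\frac{\partial u}{\partial x}\right)_A \approx \frac{u(i, j) - u(i - 1, j)}{\Delta x}$$
$$\left(\frac{\partial u}{\partial x}\right)_B \approx \frac{u(i + 1, j) - u(i, j)}{\Delta x}$$

so that the second derivative at P has the numerical form

$$\frac{\partial^2 u}{\partial x^2} = \frac{(\partial u/\partial x)_B - (\partial u/\partial x)_A}{\Delta x} = \frac{u(i + 1, j) - 2u(i, j) + u(i - 1, j)}{\Delta x^2}$$

Similarly,

$$\frac{\partial^2 u}{\partial t^2} = \frac{u(i, j + 1) - 2u(i, j) + u(i, j - 1)}{\Delta t^2}$$

Thus the wave equation $\partial^2 u/\partial t^2 = c^2\,\partial^2 u/\partial x^2$ becomes

$$\frac{u(i, j + 1) - 2u(i, j) + u(i, j - 1)}{\Delta t^2} = c^2\,\frac{u(i + 1, j) - 2u(i, j) + u(i - 1, j)}{\Delta x^2}$$

which can be rearranged as

$$u(i, j + 1) = 2u(i, j) - u(i, j - 1) + \lambda^2[u(i + 1, j) - 2u(i, j) + u(i - 1, j)] \qquad (9.34)$$

where

$$\lambda = c\,\Delta t/\Delta x$$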


Equation (9.34) is a finite-difference representation of the wave equation, and provided
that u is known on rows j - 1 and j then u(i, j + 1) can be computed on row j + 1 from
(9.34) and thus the solution continued. On the zeroth row the boundary conditions
$u(x, 0) = f(x)$ and $\partial u(x, 0)/\partial t = g(x)$ are known, so that $f_i = u(i, 0)$ and $g_i = \partial u(i, 0)/\partial t$
are also known at each node on this row, and these are used to start the process off.
From Figure 9.20, we see that

$$g_i = \frac{\partial u}{\partial t} = \frac{u(i, 1) - u(i, -1)}{2\Delta t} \qquad (9.35)$$


Figure 9.20 The first rows of mesh points in a numerical solution of the wave equation.

Now (9.34) with j = 0 becomes

$$u(i, 1) = 2u(i, 0) - u(i, -1) + \lambda^2[u(i + 1, 0) - 2u(i, 0) + u(i - 1, 0)]$$

Since $u(i, 0) = f_i$ and $u(i, -1) = u(i, 1) - 2\Delta t\,g_i$, (9.34) now takes the form

$$u(i, 1) = (1 - \lambda^2)f_i + \tfrac{1}{2}\lambda^2(f_{i+1} + f_{i-1}) + \Delta t\,g_i \qquad (9.36)$$

Thus the basic strategy is to compute row zero from $u(i, 0) = f_i$, evaluate row one from
(9.36), and then march forward for general row j by (9.34).

Example 9.17 Solve the wave equation $\partial^2 u/\partial t^2 = c^2\,\partial^2 u/\partial x^2$ numerically with the conditions

(a) $u(x, 0) = \sin(\pi x)$ $(0 \le x \le 1)$ (initial displacement);
(b) $\partial u(x, 0)/\partial t = 0$ $(0 \le x \le 1)$ (initially at rest);
(c) $u(0, t) = u(1, t) = 0$ $(t \ge 0)$ (the two ends held fixed).

Use the values c = 1, Δx = 0.25, Δt = 0.1.

Solution Note that $\lambda^2 = 0.16$. The values at t = 0 are given by condition (a)

x | 0    0.25     0.5      0.75     1
u | 0    0.7071   1        0.7071   0

The values at t = 0.1 (or j = 1) are computed from (9.36) with $f_i = \sin(\pi x_i)$

$$u(i, 1) = 0.84f_i + 0.08(f_{i+1} + f_{i-1})$$

and give

x | 0    0.25     0.5      0.75     1
u | 0    0.674    0.9531   0.674    0

The first two rows are now complete, so formula (9.34) can be used for each of the
subsequent times, for t = 0.2 (or j = 2)

$$u(i, 2) = 2u(i, 1) - u(i, 0) + 0.16[u(i + 1, 1) - 2u(i, 1) + u(i - 1, 1)]$$

which gives

x | 0    0.25     0.5      0.75     1
u | 0    0.5777   0.8169   0.5777   0

and for t = 0.3 (or j = 3)

x | 0    0.25     0.5      0.75     1
u | 0    0.4272   0.6042   0.4272   0

and so on.

This problem has an exact solution so the results can be compared with
$u(x, t) = \sin(\pi x)\cos(\pi t)$.
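
The marching strategy of (9.36) followed by (9.34) can be sketched in Python as below. This is a minimal illustration, not the book's own program: the function name `march` and its argument names are assumptions, and the two end values are simply held at zero to represent fixed ends.

```python
import math

def march(f, g, lam2, dt, steps):
    """Return rows 0..steps of the explicit scheme (steps >= 1).
    f, g hold u and du/dt at the nodes of row 0; lam2 is lambda squared.
    The first and last nodes are held at zero (fixed ends)."""
    n = len(f)
    rows = [list(f)]
    first = [0.0] * n
    for i in range(1, n - 1):          # starting row from (9.36)
        first[i] = (1 - lam2) * f[i] + 0.5 * lam2 * (f[i + 1] + f[i - 1]) + dt * g[i]
    rows.append(first)
    for _ in range(2, steps + 1):      # general rows from (9.34)
        prev, cur = rows[-2], rows[-1]
        new = [0.0] * n
        for i in range(1, n - 1):
            new[i] = 2 * cur[i] - prev[i] + lam2 * (cur[i + 1] - 2 * cur[i] + cur[i - 1])
        rows.append(new)
    return rows

# Data of Example 9.17: c = 1, dx = 0.25, dt = 0.1
c, dx, dt = 1.0, 0.25, 0.1
x = [i * dx for i in range(5)]
f = [math.sin(math.pi * xi) for xi in x]
g = [0.0] * 5
rows = march(f, g, (c * dt / dx) ** 2, dt, 3)
for j, row in enumerate(rows):
    print(j, [round(v, 4) for v in row])
```

Rows 1 to 3 reproduce the tabulated values of the example to four decimal places; at x = 0.5, t = 0.3 the exact solution is cos(0.3π) ≈ 0.5878, so the coarse mesh overestimates the displacement there by a few per cent.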
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a 


Figure 9.21 Solution of Example 9.18 with Δx = 0.2, λ = 0.5 for successive values of ct.

Numerical calculations can be performed very efficiently with MATLAB: the ‘colon’
notation allows complex manipulations of sub-matrices to be done and makes the
package ideally suited to this type of computation. The instructions, for n mesh
points and general parameter L = λ²,

    n=5;L=0.16;                        % values in the example
    x=[0:1/(n-1):1];z=sin(pi*x);
    zz=[0 (1-L)*z(2:n-1)+L*(z(3:n)+z(1:n-2))/2 0]
                                       % sets up second line
    zzz=[0 2*(1-L)*zz(2:n-1)+L*(zz(3:n)+zz(1:n-2))-z(2:n-1) 0]
                                       % sets up the third line
    z=zz;zz=zzz;                       % prepares for subsequent lines

produce the solution to this problem. Repeating the last two lines continues the solution
for t incremented by Δt.

Example 9.18 Solve the wave equation $\partial^2 u/\partial t^2 = c^2\,\partial^2 u/\partial x^2$ for a semi-infinite string, given the initial
conditions

(a) $u(x, 0) = x\exp[-5(x - 1)^2]$ $(x \ge 0)$ (string given an initial displacement);
(b) $\partial u(x, 0)/\partial t = 0$ $(x \ge 0)$ (string at rest initially);
(c) $u(0, t) = 0$ $(t \ge 0)$ (string held at the point x = 0).

Solution Since $g_i = 0$ in (9.36), only the one parameter λ needs to be specified. Figure 9.21 shows
the solution of u over eight time steps with λ = 0.5. It can be seen that the solution splits
into two waves, one moving in the +x direction and the other in the -x direction. At a
given time t = 0.8/c, the u values are presented in the table shown in Figure 9.22 for
various values of λ. We see that for λ ≤ 1 the solution is reasonably consistent, and
we have errors of a few per cent.





Figure 9.22 Table of values of u for a numerical solution of Example 9.18 with
$\Delta x = 0.2$ and $ct = 0.8$.

x                    | 0  0.2     0.4     0.6     0.8     1.0
u($\lambda$ = 0.25)  | 0  0.3451  0.4674  0.3368  0.1353  0.0236
u($\lambda$ = 0.5)   | 0  0.3487  0.4665  0.3318  0.1340  0.0272
u($\lambda$ = 1)     | 0  0.3652  0.4582  0.3105  0.1322  0.0408
u($\lambda$ = 2)     | 0  0.1078  0.3571  0.6334  0.5742  0.2749


However, for $\lambda = 2$ the solution looks very suspect. A further two time steps gives, at
$ct = 1.6$, the solution

x                  | 0  0.2    0.4    0.6     0.8     1
u($\lambda$ = 2)   | 0  -3.12  21.75  -10.25  -34.70  32.72


Clearly the solution has gone wild!


Looking back to Figure 9.20, we can attempt an explanation for the apparent divergence
of the solution in Example 9.18. The characteristics through the points $(x_{i-1}, 0)$
and $(x_{i+1}, 0)$ are

$$x = x_{i-1} + ct$$
$$x_{i+1} = x + ct$$

which can be solved to give, at the point P,

$$x_P = \tfrac{1}{2}(x_{i-1} + x_{i+1})$$
$$ct_P = \tfrac{1}{2}(x_{i+1} - x_{i-1}) = \Delta x$$

Recalling the work done on characteristics, we should require the new point to be inside
the domain of dependence defined by the interval $(x_{i-1}, x_{i+1})$. Hence we require

$$t_P \ge \Delta t$$

so

$$\frac{c\,\Delta t}{\Delta x} \le 1 \qquad (9.37)$$

Indeed, a careful analysis, found in many specialist numerical analysis books, shows
that this is precisely the condition for convergence of the method; it is commonly called
the Courant, Friedrichs and Lewy (CFL) condition.

The stringent condition on the time step At has always been considered to be a limita- 
tion on so-called explicit methods of the type described here, but such methods have 
the great merit of being very simple to program. As computers get faster, the very short 
time step is becoming less of a problem, and vector or array processors allow nodes to 
be dealt with simultaneously, thus making such methods even more competitive. 
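The effect of violating the CFL condition can be seen in a small experiment. The sketch below is illustrative only: the function name `max_amplitude` and the choice of a five-point string with only its middle node displaced are assumptions made for the demonstration, not part of the examples above.

```python
def max_amplitude(lam2, steps, n_points=5):
    """Largest |u| reached by the explicit wave scheme on a string fixed at
    both ends, started from rest with only the middle node displaced.

    lam2 is lambda^2 = (c*dt/dx)^2.
    """
    prev = [0.0] * n_points
    prev[n_points // 2] = 1.0

    # Start-up row for zero initial velocity.
    cur = [0.0] * n_points
    for i in range(1, n_points - 1):
        cur[i] = (1 - lam2) * prev[i] + 0.5 * lam2 * (prev[i + 1] + prev[i - 1])

    biggest = max(abs(v) for v in prev + cur)
    for _ in range(2, steps + 1):
        new = [0.0] * n_points
        for i in range(1, n_points - 1):
            new[i] = (2 * cur[i] - prev[i]
                      + lam2 * (cur[i + 1] - 2 * cur[i] + cur[i - 1]))
        prev, cur = cur, new
        biggest = max(biggest, max(abs(v) for v in cur))
    return biggest

stable = max_amplitude(0.25, 10)    # lambda = 0.5, inside the CFL limit
unstable = max_amplitude(4.0, 10)   # lambda = 2, outside the CFL limit
print(stable, unstable)
```

For $\lambda = 0.5$ the displacement never exceeds its initial size, whereas for $\lambda = 2$ the highest mesh mode is amplified at every step and the values grow by many orders of magnitude within ten steps.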

There are, however, clear advantages in the stability of calculations if an implicit
method is used. In Figure 9.19 the approximation to $u_{xx}$ may be formed by the average
of the approximations from rows j + 1 and j - 1. Thus

$$[u(i, j+1) - 2u(i, j) + u(i, j-1)]/(c^2\Delta t^2)$$
$$= \tfrac{1}{2}[u(i+1, j+1) - 2u(i, j+1) + u(i-1, j+1)$$
$$\quad + u(i+1, j-1) - 2u(i, j-1) + u(i-1, j-1)]/\Delta x^2$$

Assuming that u is known on rows j and j - 1, we can rearrange the equation into the
convenient form

$$-\lambda^2 u(i+1, j+1) + 2(1 + \lambda^2)u(i, j+1) - \lambda^2 u(i-1, j+1)$$
$$= 4u(i, j) + \lambda^2 u(i+1, j-1) - 2(1 + \lambda^2)u(i, j-1) + \lambda^2 u(i-1, j-1) \qquad (9.38)$$

The right-hand side of (9.38) is known, since it depends only on rows j and j - 1. The
unknowns on row j + 1 appear on the left-hand side. The equations must now be solved
simultaneously using the Thomas algorithm for a tridiagonal matrix (described in
Section 5.5.2 of Modern Engineering Mathematics). This algorithm is very rapid and
requires little storage. It can be shown that the method will proceed satisfactorily for
any $\lambda$, so that the time step is unrestricted. The evaluation of rows 0 and 1 is the same
as for the explicit method, so this can reduce the accuracy, and clearly the algorithm
needs a finite x region to allow the matrix inversion.


Example 9.19

Solve the wave equation $\partial^2 u/\partial t^2 = c^2\,\partial^2 u/\partial x^2$ by an implicit method given

(a) $u(0, t) = 0$ $(t \ge 0)$ (fixed at x = 0);
(b) $u(1, t) = 0$ $(t \ge 0)$ (fixed at x = 1);
(c) $\partial u(x, 0)/\partial t = 0$ $(0 \le x \le 1)$ (zero initial velocity);
(d) $u(x, 0) = 1$ at $x = \tfrac14$ and $u(x, 0) = 0$ otherwise (displaced at the one point $x = \tfrac14$).

Compare the solutions at a fixed time for various $\lambda$.


Solution

Here we have a wave equation solved for a string stretched between two points and
displaced at a single point.

The numerical solution shows the expected behaviour of a wave splitting into two
waves, one moving in the -x direction and the other in the +x direction. The waves are
reflected from the ends, and eventually give a complicated wave shape.

The computations were performed with $\Delta x = 0.125$ and various $\lambda$ or $\Delta t$, with
$\lambda = c\Delta t/\Delta x$. The values of u are given at the same time, $T = \Delta x/c$, for various $\lambda$:

x                     | 0  0.125   0.25    0.375   0.5     0.625   0.75    0.875  1
u($\lambda$ = 0.2)    | 0  0.3394  0.2432  0.3412  0.0352  0.0019  0.0001  0      0
u($\lambda$ = 0.1)    | 0  0.3479  0.2297  0.3493  0.0344  0.0014  0       0      0
u($\lambda$ = 0.05)   | 0  0.3506  0.2254  0.3519  0.0341  0.0013  0       0      0
u($\lambda$ = 0.025)  | 0  0.3514  0.2243  0.3526  0.0340  0.0014  0       0      0


Although the method converges for all $\lambda$, accuracy still requires a small $\lambda$ (or time
step); the value $\lambda = 0.05$ certainly gives results accurate to within 1%. It may be noted
that at the chosen value of T the wave has split but has not progressed far enough to be
reflected from the end x = 1.


28 


29 


= 
Equation (9.38) can be written in matrix form

$$AU_{j+1} = 4U_j - AU_{j-1} \quad\text{or}\quad U_{j+1} = 4A^{-1}U_j - U_{j-1}$$

where the U vectors represent the whole row of u values. This makes the problem
ideal for MATLAB. The instructions, for n mesh points and general L = $\lambda^2$,

n=9; L=0.04;                     % values for the example (here lambda = 0.2)
a=[-L 2*(1+L) -L]; A=eye(n); for i=2:n-1, A(i,i-1:i+1)=a; end
                                 % sets up matrix A
b=[L/2 1-L L/2]; C=eye(n); for i=2:n-1, C(i,i-1:i+1)=b; end
u=zeros(n,1); u(3)=1; v=C*u;     % sets up lines one and two
B=inv(A); w=4*B*v-u              % evaluates line three
u=v; v=w; w=4*B*v-u              % continues the solution

compute the solution. Repeating the last line continues the solution by an increment
$\Delta t$.


The methods described in this section all extend to higher dimensions, and some to
nonlinear problems. The work involved is, of course, correspondingly greater.


9.3.6 Exercises 


Use an explicit method to solve the wave 
equation 0?u/dt? = c?9?u/ax? for the boundary 
conditions 


(а) (0, ) =0 (72 0) 
(6) ш(1, = 0 (720) 
(с) и(х, 0) =0 (0 =х%1) 


(d) du 0) - d^ 0 =2= 5) 
1-х ( 


=x 


1 


Use Ax = At= } and study the behaviour for a 


variety of values of A for the first three time steps. 


Compare your result with the implicit version 
in (9.38). 


An oscillator is started at the end of a tube, 
and oscillations propagate according to the 
wave equation. The displacement u(x, f) 
satisfies 


2 
Ou 


-lgu 
o) cog 


30 


L] 


in 0 « x «€ I, for t — 0, with the boundary conditions 
(а) (0, ) = аѕіп 01, u(l,t) 0 (t 0) 


(b) u(x, 0) = 24605.) - (0 x « 1) 


where c, a and œ are real positive constants. 
Show that the solution of the partial differential 
equation is 


uis, Se a sin Qt sin [oX/ - x)/c] 


sin(ol/c) 


+ Y RT UE sin (472) g(a) 
provided that w//mc is not an integer. 

Compare this solution with one computed using 
the explicit numerical method. Use a= 1, /= 1, 
c=1,@= : T, Ax — 0.2 and At — 0.02 to evaluate 
u(x, 0.06). 


Solve the equation 


2 2 
Pyy Zu 
ox ot 
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numerically, subject to the conditions 31 Solve the equation 
2 2 
д 29и 99и 
и=х(1 х), z (0<х<1) аїт=0 S © ae = 
u=0 (x=0,1) fort>0 numerically, subject to the conditions 
Use u=x(1 —x), du o for allx att- 0 
(a) an explicit method with Ax — At — 0.2 and Use 
A=0.5; 
(b) an implicit method with Ax = At = 0.2 and (a) an explicit method with Ax = Ar = 0.2 and 
Az i5. À — 0.5; 
(b) an implicit method with Ax 2 At 2 0.2 and 
Compare your solution with that in Exercise 31. А = 0.5. 


9.4 Solution of the heat-conduction/diffusion equation


In this section we consider methods for solving the heat-conduction/diffusion equation 
introduced in Section 9.2.2. 


9.4.1 Separation method 


It was with the aim of solving heat-conduction problems that Fourier (c. 1800) first used 
the idea of separation of variables and Fourier series. As indicated in Section 7.1, many 
mathematicians at the time argued about the validity of his approach, while he continued 
to solve many practical problems. 


In Section 9.2.2 we noted that the heat-conduction equation

$$\frac{1}{\kappa}\frac{\partial u}{\partial t} = \nabla^2 u \qquad (9.5)$$

has a steady-state solution U, provided there are no time-varying inputs, satisfying

$$\nabla^2 U = 0$$

and appropriate boundary conditions. One useful way to write the general solution is

$$u = U + v$$

where v also satisfies (9.5) and the boundary conditions for u - U. Certainly the heat-conduction
interpretation supports this idea, and we base our strategy on first finding U
and then determining the transient v that takes the solution from its initial to its final state.
We note that $v \to 0$ as $t \to \infty$, so that $u \to U$, and an obvious method is to try an exponential
decay to zero. Thus, in the one-dimensional form of the heat-conduction equation

$$\frac{1}{\kappa}\frac{\partial v}{\partial t} = \frac{\partial^2 v}{\partial x^2} \qquad (9.39)$$

we seek a separated solution of the special type discussed in Section 9.3.2, where the
physics indicates a solution

$$v = e^{-\alpha t}w(x)$$

which on substitution gives

$$-\frac{\alpha}{\kappa}w = \frac{d^2 w}{dx^2}$$

Letting $\alpha/\kappa = \lambda^2$, we can solve this simple-harmonic equation to give

$$w = A\sin\lambda x + B\cos\lambda x$$

and hence

$$v = e^{-\alpha t}(A\sin\lambda x + B\cos\lambda x) \qquad (9.40)$$

Taking the hint from Section 9.3.2, we expect in general to take sums of terms like (9.40)
to satisfy all the boundary conditions. Thus we build up a solution

$$v = \sum e^{-\alpha t}(A\sin\lambda x + B\cos\lambda x) \qquad (9.41)$$


Example 9.20

Solve the heat-conduction equation $\partial T/\partial t = \kappa\,\partial^2 T/\partial x^2$ subject to the boundary conditions

(a) $T = 0$ at $x = 0$ and for all $t > 0$ (held at zero temperature);
(b) $\partial T/\partial x = 0$ at $x = l$ and for all $t > 0$ (no heat loss from this end);
(c) $T = T_0\sin(3\pi x/2l)$ at $t = 0$ and for $0 \le x \le l$ (given initial temperature profile).


Solution

We first note that as $t \to \infty$ the solution will be T = 0, so the steady-state solution is
zero, and so from (9.40) we consider a solution of the form

$$T = e^{-\alpha t}(A\sin\lambda x + B\cos\lambda x) \qquad (9.42)$$

In order to satisfy the boundary condition (a), it is clear that it is not possible to include
the cosine term, so B = 0. To satisfy the condition (b) then requires

$$\frac{\partial T}{\partial x} = \lambda e^{-\alpha t}A\cos\lambda x = 0 \quad (x = l)$$

so that

$$\cos\lambda l = 0, \quad\text{or}\quad \lambda l = (n + \tfrac12)\pi \quad (n = 0, 1, 2, 3, \ldots)$$

leading to the solution

$$T = Ae^{-\alpha t}\sin\left[\left(n + \tfrac12\right)\frac{\pi x}{l}\right]$$

We now compare the T from condition (c) with the solution just obtained at time t = 0,
giving

$$Ae^{0}\sin\left[\left(n + \tfrac12\right)\frac{\pi x}{l}\right] = T_0\sin\left(\frac{3\pi x}{2l}\right)$$

The unknown parameters can now be identified as n = 1 and $A = T_0$. Since
$\alpha = \kappa\lambda^2 = 9\kappa\pi^2/4l^2$, the final solution is

$$T = T_0\exp\left(-\frac{9\kappa\pi^2 t}{4l^2}\right)\sin\left(\frac{3\pi x}{2l}\right)$$
Example 9.21

Solve the heat-conduction equation $\partial u/\partial t = \kappa\,\partial^2 u/\partial x^2$ subject to the boundary conditions

(a) $u(0, t) = 0$ $(t \ge 0)$ (zero temperature at the end x = 0);
(b) $u(l, t) = 0$ $(t \ge 0)$ (zero temperature at the end x = l);
(c) $u(x, 0) = u_0(\tfrac12 - x/l)$ $(0 < x < l)$ (a given initial temperature profile).


Solution

We are solving the problem of heat conduction in a bar that is held at zero temperature
at its ends and with a given initial temperature profile.

It is clear that the final solution as $t \to \infty$ is u = 0, so that U = 0 is the steady-state
solution, and hence from (9.40) we seek a solution of the form

$$u = e^{-\alpha t}(A\sin\lambda x + B\cos\lambda x)$$

subject to the given boundary conditions. The first of these conditions (a) gives B = 0,
while the second condition (b) gives

$$\sin\lambda l = 0, \quad\text{or}\quad \lambda l = n\pi \quad (n = 1, 2, \ldots)$$

Recalling that $\lambda^2 = \alpha/\kappa$, we find solutions of the form

$$u = Ae^{-\kappa n^2\pi^2 t/l^2}\sin\left(\frac{n\pi x}{l}\right) \quad (n = 1, 2, \ldots)$$

Clearly we cannot satisfy (c) from a single solution, so, as indicated in (9.41), we revert
to the sum

$$u = \sum_{n=1}^{\infty} A_n e^{-\kappa n^2\pi^2 t/l^2}\sin\left(\frac{n\pi x}{l}\right) \qquad (9.43)$$

which is also a solution.
Using the boundary condition (c), we then have

$$u_0\left(\tfrac12 - \frac{x}{l}\right) = \sum_{n=1}^{\infty} A_n\sin\left(\frac{n\pi x}{l}\right)$$

and hence, by (7.33),

$$\tfrac12 lA_n = \int_0^l u_0\left(\tfrac12 - \frac{x}{l}\right)\sin\left(\frac{n\pi x}{l}\right)dx = \begin{cases} 0 & (\text{odd } n) \\ u_0 l/n\pi & (\text{even } n) \end{cases}$$

Note again that we have used a periodic extension of the given function to obtain a
Fourier sine series valid over the interval $0 < x < l$. Outside the interval $0 < x < l$ we
have no physical interest in the solution. Substituting back into (9.43) gives as a final
solution

$$u = \frac{u_0}{\pi}\sum_{m=1}^{\infty}\frac{1}{m}e^{-4\kappa m^2\pi^2 t/l^2}\sin\left(\frac{2m\pi x}{l}\right) \qquad (9.44)$$

or, in an expanded form,

$$u = \frac{u_0}{\pi}\left[e^{-4\kappa\pi^2 t/l^2}\sin\left(\frac{2\pi x}{l}\right) + \tfrac12 e^{-16\kappa\pi^2 t/l^2}\sin\left(\frac{4\pi x}{l}\right) + \tfrac13 e^{-36\kappa\pi^2 t/l^2}\sin\left(\frac{6\pi x}{l}\right) + \ldots\right]$$

Figure 9.23 Solution of Example 9.21 with $T = t(4\kappa\pi^2/l^2)$.
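The rate at which the series (9.44) settles down can be checked numerically. The sketch below is illustrative only (the function name `partial_sum` is hypothetical); it sums the first few terms of (9.44) for $u/u_0$ in terms of the scaled time $T = t(4\kappa\pi^2/l^2)$, so that the m-th exponential is $e^{-m^2 T}$, and evaluates the case T = 0.5, x/l = 0.2 considered in this example.

```python
import math

def partial_sum(T, x_over_l, terms):
    """Partial sum of the series (9.44) for u/u0.

    T is the scaled time 4*kappa*pi^2*t/l^2, so the m-th term is
    exp(-m^2 * T) * sin(2*m*pi*x/l) / m, all divided by pi.
    """
    total = 0.0
    for m in range(1, terms + 1):
        total += math.exp(-m * m * T) * math.sin(2 * m * math.pi * x_over_l) / m
    return total / math.pi

for terms in (1, 2, 3):
    print(terms, round(partial_sum(0.5, 0.2, terms), 4))
```

The first three partial sums come out as 0.1836, 0.1963 and 0.1956, showing how quickly the exponential factors suppress the higher terms once T is not small.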


In Figure 9.23, $u/u_0$ is plotted against $x/l$ at successive times $T = t(4\kappa\pi^2/l^2) = 0, 0.5, 1, 1.5$.
Taking successive terms in the series for the values $T = t(4\kappa\pi^2/l^2) = 0.5$ and $x/l = 0.2$,
we get

        | 1 term  2 terms  3 terms  4 terms
$u/u_0$ | 0.1836  0.1963   0.1956   0.1953

and we see that three terms of this series would probably be sufficient to give three-figure
accuracy. In all such problems some numerical experimentation is required to
determine how many terms are required. For small t and x we should expect to need a
large number of terms, since the temperature at the end switches from $\tfrac12 u_0$ to 0 at time
t = 0. It is well known, as we saw in Chapter 7, that discontinuities cause convergence
difficulties for Fourier series.


It may be noted in Example 9.21 that the initial discontinuity is smoothed out, as we
expected from the physical ideas that we outlined in Section 9.2.2.


Example 9.22

Solve the heat-conduction equation $\partial u/\partial t = \kappa\,\partial^2 u/\partial x^2$ in a bar subject to the boundary
conditions

(a) $u(0, t) = 0$ $(t \ge 0)$ (the end x = 0 is held at zero temperature);
(b) $u(1, t) = 1$ $(t \ge 0)$ (the end x = 1 is at temperature 1);
(c) $u(x, 0) = x(2 - x)$ $(0 \le x \le 1)$ (the initial temperature profile is given).


Solution

First it is clear that the final steady-state solution is U = x, since this satisfies (a) and (b)
and also $\nabla^2 U = 0$. Secondly, putting u = U + v, the new variable v satisfies (9.39), but
now the boundary conditions on v are
(a′) $v(0, t) = 0$
(b′) $v(1, t) = 0$
(c′) $v(x, 0) = x(2 - x) - x = x - x^2$

The appropriate solutions in (9.40) can now be selected; the condition (a′) gives B = 0,
while the condition (b′), v(1, t) = 0, gives

$$\sin\lambda = 0, \quad\text{or}\quad \lambda = n\pi \quad (n = 1, 2, \ldots)$$

From (9.41) we then have

$$v = \sum_{n=1}^{\infty} a_n e^{-\kappa n^2\pi^2 t}\sin n\pi x$$

and condition (c′) gives

$$x - x^2 = \sum_{n=1}^{\infty} a_n\sin n\pi x$$

Determining the Fourier coefficient using (7.33),

$$\tfrac12 a_n = \int_0^1 (x - x^2)\sin n\pi x\,dx = \begin{cases} 4/n^3\pi^3 & (\text{odd } n) \\ 0 & (\text{even } n) \end{cases}$$

Thus the complete solution is

$$u = x + \frac{8}{\pi^3}\sum_{n=1}^{\infty}\frac{e^{-\kappa(2n-1)^2\pi^2 t}\sin(2n - 1)\pi x}{(2n - 1)^3}$$

or, in expanded form,

$$u = x + \frac{8}{\pi^3}\left[e^{-\kappa\pi^2 t}\sin\pi x + \tfrac{1}{27}e^{-9\kappa\pi^2 t}\sin 3\pi x + \tfrac{1}{125}e^{-25\kappa\pi^2 t}\sin 5\pi x + \ldots\right]$$


9.4.2 Laplace transform method

As we saw for the wave equation in Section 9.3.3, Laplace transforms provide an alternative
method of solution for the heat-conduction or diffusion equation. The method has
the merit of dealing with the boundary conditions easily, but it suffers from the usual
difficulty of performing the final inversion. The following example serves to illustrate
these points.


Example 9.23

Using the Laplace transform method, solve the diffusion equation

$$\frac{\partial u}{\partial t} = \kappa\frac{\partial^2 u}{\partial x^2} \quad (\text{all } x,\ t > 0)$$

given that u(x, t) remains bounded and satisfies the boundary conditions

(a) $u(x, 0) = 0$ (all x)
(b) $u(a, t) = T\delta(t)$


Solution


The problem models an infinite pipe, coincident with the x axis, that is initially
filled with clean fluid. It is subjected to an instantaneous burst of contaminant
injected at the point x = a > 0; the concentration of the contaminant, as it diffuses, is
required.

Using (9.30) and (9.31) and taking Laplace transforms gives

$$sU(x, s) - u(x, 0) = \kappa\frac{d^2 U(x, s)}{dx^2}$$

which, on using condition (a), leads to the ordinary differential equation

$$sU = \kappa\frac{d^2 U}{dx^2}$$

This is readily solved to give

$$U(x, s) = Ae^{\sqrt{s/\kappa}\,x} + Be^{-\sqrt{s/\kappa}\,x}$$

Since concentration remains bounded

$$U(x, s) = Be^{-\sqrt{s/\kappa}\,x} \ \text{for } x > a \quad\text{and}\quad U(x, s) = Ae^{\sqrt{s/\kappa}\,x} \ \text{for } x < a$$

Condition (b) then gives

$$U(a, s) = \mathcal{L}\{T\delta(t)\} = T = Be^{-\sqrt{s/\kappa}\,a}$$

so that

$$B = Te^{\sqrt{s/\kappa}\,a} \quad\text{and similarly}\quad A = Te^{-\sqrt{s/\kappa}\,a}$$

giving

$$U(x, s) = Te^{-(x-a)\sqrt{s/\kappa}} \ \text{for } x > a \quad\text{and}\quad U(x, s) = Te^{-(a-x)\sqrt{s/\kappa}} \ \text{for } x < a$$

To find the solution u(x, t), we must invert the Laplace transform. However, in
this case, the methods discussed in Chapter 5 do not suffice, and it is necessary
to resort to the use of the complex inversion integral, which is dealt with in
specialist texts on Laplace transforms (see also Chapter 8, Review exercise 7).
Alternatively, we can turn to the extensive tables that exist of Laplace transform
pairs, to find that

$$\mathcal{L}^{-1}\{e^{-b\sqrt{s}}\} = \frac{b}{2\sqrt{\pi}\,t^{3/2}}e^{-b^2/4t}, \quad b > 0$$

We can then carry out the required inversion, with $b = |x - a|/\sqrt{\kappa}$, to give the solution

$$u(x, t) = \frac{T\,|x - a|}{2\sqrt{\pi\kappa}\,t^{3/2}}e^{-(x-a)^2/4\kappa t} \quad (t > 0)$$
ү 
It is possible to solve this example by using the extensive tables of Laplace transforms
in MAPLE. It does, however, require some knowledge of the manipulative
skills contained in the package to progress easily.


It should be noted that the solution of Example 9.23 is not variable separable and could 
not be obtained from the methods of Section 9.4.1. 

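To date all the problems studied have been restricted to heat flow in a rod. The ideas
can be extended to spherical or cylindrical regions. Here one example will be considered
that involves radially symmetric regions; see also Exercises 4 and 33. First the heat-conduction
equation is written for a problem that only depends on the radial distance r
from the origin and the time t. Now

$$r^2 = x^2 + y^2 + z^2 \quad\text{so}\quad \frac{\partial r}{\partial x} = \frac{x}{r}$$

To obtain the Laplacian

$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}$$

when f = f(r, t) evaluate

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial r}\frac{\partial r}{\partial x} = \frac{x}{r}\frac{\partial f}{\partial r}$$

and then the second derivative

$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{x}{r}\frac{\partial f}{\partial r}\right) = \frac{1}{r}\frac{\partial f}{\partial r} + \frac{x^2}{r}\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial f}{\partial r}\right)$$

The y and z derivatives follow in a similar manner so the Laplacian becomes

$$\nabla^2 f = \frac{3}{r}\frac{\partial f}{\partial r} + r\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial f}{\partial r}\right) = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial f}{\partial r}\right)$$

The radially symmetric heat-conduction equation for the temperature T(r, t) is
therefore

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial T}{\partial r}\right) = \frac{1}{\kappa}\frac{\partial T}{\partial t} \qquad (9.45)$$

This radially symmetric form can be made to look very similar to the one-dimensional
equation by writing

$$T = \frac{\theta(r, t)}{r}$$

With this substitution

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(-\theta + r\frac{\partial\theta}{\partial r}\right) = \frac{1}{r\kappa}\frac{\partial\theta}{\partial t}$$

and differentiating once more

$$\frac{1}{r^2}\left(-\frac{\partial\theta}{\partial r} + \frac{\partial\theta}{\partial r} + r\frac{\partial^2\theta}{\partial r^2}\right) = \frac{1}{r\kappa}\frac{\partial\theta}{\partial t}$$

Thus the equation for $\theta(r, t)$ is just the usual one-dimensional heat-conduction
equation

$$\frac{\partial^2\theta}{\partial r^2} = \frac{1}{\kappa}\frac{\partial\theta}{\partial t} \qquad (9.46)$$

Methods developed earlier can now be used on this equation, but note that care must be
taken at the origin where r = 0 since

$$T = \frac{\theta(r, t)}{r}$$

An example will illustrate some important points.


Example 9.24

Solve the radially symmetric heat-conduction equation (9.45)

(a) in the region $0 \le r \le a$ subject to the boundary conditions $T = T_0$ on r = a and T
is finite in the region. The initial condition is T(r, 0) = 0.

(b) in the region $r \ge a$ subject to the boundary conditions $T = T_0$ on r = a and T tends
to zero as r tends to infinity. The initial condition is T(r, 0) = 0.


Solution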


(a) To solve (9.45) put

$$T = T_0 + \frac{\theta(r, t)}{r}$$

then $\theta(r, t)$ satisfies (9.46) with the modified boundary conditions $\theta(a, t) = 0$ and
$\theta(r, t)/r$ is finite at the origin. Clearly the only separated solution in (9.40) that
can satisfy these conditions is

$$\theta = e^{-\alpha t}\sin\lambda r \quad\text{where } \alpha/\kappa = \lambda^2$$

since $\sin\lambda r/r \to \lambda$ as $r \to 0$. The condition at r = a gives $\sin(\lambda a) = 0$ so $\lambda a = n\pi$
where n is a positive integer. Summing all solutions of this type gives

$$T = T_0 + \frac{1}{r}\sum_{n=1}^{\infty} A_n e^{-\kappa n^2\pi^2 t/a^2}\sin(n\pi r/a)$$

The coefficients $A_n$ are given by the initial condition, which reduces to the Fourier
series problem

$$-rT_0 = \sum_{n=1}^{\infty} A_n\sin(n\pi r/a)$$

giving

$$A_n = \frac{2aT_0(-1)^n}{n\pi}$$

and finally

$$T = T_0 - \frac{2aT_0}{\pi r}\left[e^{-\kappa\pi^2 t/a^2}\sin(\pi r/a) - \tfrac12 e^{-4\kappa\pi^2 t/a^2}\sin(2\pi r/a) + \ldots\right]$$

(b) As in part (a) we look to previous methods of solution. First note that it is only
necessary to solve (9.46) with the modified boundary conditions $\theta(a, t) = aT_0$ and
$\theta(r, t)$ is finite for infinite r. The initial condition is $\theta(r, 0) = 0$. The problem is
now very similar to Example 9.23 and Laplace transforms appear to be a sensible
approach. The set-up is precisely as in Example 9.23 (except for a change in the
names of the parameters) to the point where

$$s\bar\theta(r, s) = \kappa\frac{d^2\bar\theta(r, s)}{dr^2}$$

where it should be noted that the zero initial condition has been used. The solution is

$$\bar\theta(r, s) = Ae^{\sqrt{s/\kappa}\,r} + Be^{-\sqrt{s/\kappa}\,r}$$

which is now subject to the condition that $\bar\theta(r, s)$ is finite as r gets large, implying
that A = 0. To calculate B, the transformed condition at r = a gives $\bar\theta(a, s) = aT_0/s$
and hence

$$B = \frac{aT_0}{s}e^{\sqrt{s/\kappa}\,a}$$

Thus the solution for the transformed equation is

$$\bar\theta(r, s) = \frac{aT_0}{s}e^{-\sqrt{s/\kappa}\,(r-a)}$$

The expression requires either access to extensive tables of Laplace transforms or
to the tables in MATLAB or MAPLE. They give

$$\theta(r, t) = aT_0\,\mathrm{erfc}\left(\frac{r - a}{2\sqrt{\kappa t}}\right)$$

so that

$$T(r, t) = \frac{aT_0}{r}\,\mathrm{erfc}\left(\frac{r - a}{2\sqrt{\kappa t}}\right)$$

The error function

$$\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-v^2}\,dv$$

and erfc(z) = 1 - erf(z) occur commonly in the solution of the heat-conduction
equation (see also Section 7.6), and in statistics. The error function is well documented
and appears as standard in MATLAB and MAPLE. The erfc function is illustrated in
Figure 9.24 and the solution $T/T_0$ is plotted against r/a in Figure 9.25 for various times.

Figure 9.24 The function erfc(z).

Figure 9.25 Plot of the solution to Example 9.24 for times $4\kappa t/a^2 = 0.01, 0.1, 1, 10$ and 100.
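The solution of part (b) is straightforward to evaluate, since the complementary error function is available in most numerical libraries. The following sketch is illustrative only: the function name `temperature_outside_sphere`, the unit values of a, $\kappa$ and $T_0$, and the sample radii are assumptions chosen for the demonstration; `math.erfc` is the Python standard-library routine.

```python
import math

def temperature_outside_sphere(r, t, a=1.0, kappa=1.0, T0=1.0):
    """Evaluate T(r, t) = (a*T0/r) * erfc((r - a) / (2*sqrt(kappa*t)))
    for r >= a and t > 0 (part (b) of the example)."""
    return a * T0 / r * math.erfc((r - a) / (2.0 * math.sqrt(kappa * t)))

# Sample the profile at the scaled times tau = 4*kappa*t/a^2.
for tau in (0.01, 0.1, 1, 10, 100):
    t = tau / 4.0
    profile = [round(temperature_outside_sphere(r, t), 4) for r in (1.0, 1.5, 2.0, 3.0)]
    print(tau, profile)
```

At r = a the formula returns $T_0$ for every t > 0, and at a fixed radius the temperature rises with time towards the limiting value $aT_0/r$.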


32 
9.4.3 Exercises 
Find the solution to the equation SS 
7 ax” 


satisfying the conditions 


(a) ди/дх = 0 at x =0 for all t 

(b) uz 0atx- 1 for all t 

(с) wu=acos (mtx) cos ($x) forO<x <1 
when t 2 0 





The spherically symmetric form of the heat- 
conduction equation is 


2 I 
U,, + =u,=—U, 
r K 


By putting ru = v, show that v satisfies the standard 
one-dimensional heat-conduction equation. What 
can we expect of a solution as r — co? 

Solve the equation in the annulus a S r < b 
given that u(a, t) = T), u(b, t) = 0 for all t > 0 and 




34 


35 


36 


ES 


38 




the initial condition u(r, 0) 2 0 fora « r « b. Show 
that the solution has the form 


Dies xit Е 4 ve 
b-a 


where A(b — a) = No. Evaluate the Fourier 
coefficients Ay. 








г= 30 = 
r 


Show that u(x, f) 2 t^F(1]), where 1] = x?/t is a 
solution of the partial differential equation 


ot ax’ 
if F satisfies 


an LE 


T uc 
dn 39 


Find non-zero values of « and «x for which F = e” 
is a solution. 


Адат 


Show that u(x, t) = f(x) sin(x — 
the heat-conduction equation 


ðt aX 


provided that f and the constant fj are chosen 
suitably. For a semi-infinite slab of uniform 
material occupying the region x = 0 construct 

the solution that satisfies (a) u is bounded as 

x — œ and (b) u(0, f) = u sin 2t (that is, a periodic 
temperature is imposed at x = 0). 


Bt) is a solution of 


Show that the equation 
0, = KO. m h(0— ө) 


can be reduced to the standard heat-conduction 
equation by writing u — e"(0 — 0). How do you 
interpret the term A(0 — 0;)? 40 


Use separation of variables to obtain a solution to 
the heat-conduction equation du/dt = KO7u/dx’, 
given 


(a) du(0,f) dx 20 (7 2 0) 
(b) uf, 20 (72 0) 
(c) u(x, 0) = uy 


Compare the solution with that obtained in 
Example 9.21. 


-xl) (0mx«l) 


The voltage v at a time f at a distance x along an 
electric cable of length L with capacitance and 
resistance only, satisfies 


9v | 19v 


dx С K ðt 
Verify that a form of the solution appropriate to the 


conditions that v = vọ when x = 0, and v = 0 when 
X — L, for all values of f, is given by 


= к? 
= 1-E Je Ye exp [EET sin| nx 
І AM Е L 


where v, and the c, are constants. 
Show that if, in addition, v = 0 when t = 0 for 
SIAL 


2 vo 
пт 


с„=— 


п 


A uniform bar of length / has its ends maintained 
at a temperature of 0°C. Initially, the 
temperature at any point between the ends of 

the bar is 10 °C, and, after a time t, the temperature 
u(x, f) at a distance x from one end of the bar 
satisfies the one-dimensional heat-conduction 
equation 


ди _ 1ди 


с >) 
d) Kat i ) 


Write down boundary conditions for the bar and 
show that the solution corresponding to these 
conditions is 


u(x, t) = 


ay (1 -cosnm) ex (7 == n 3 


n=1 
x sin(x) 
1 
The function (x, f) satisfies the equation 


2 
9ó 99.4 
ot dx? 


(7h € x € h, t» 0) 


with the boundary conditions 


(а) ф(-^, ) = o(h, ) = 
(b) (0,0) =0 (-л<х <А) 


0 (t2 0) 


where a, b and Л are positive real constants. 
Show that the Laplace transform of the solution 


ф(х, f) is 


bh. cosh [(s/a) ^x J| 
s? 1/2 


cosh[(s/a) ^A] 


9.4.4 Numerical solution

As for the wave equation, except for the most straightforward problems, we must resort
to numerical solutions of the heat-conduction equation. Even when analytical solutions
are known, they are not always easy to evaluate because of convergence difficulties near
singularities. They are, of course, crucial in testing the accuracy and efficiency of
numerical methods.

We can write the heat-conduction equation

$$\frac{1}{\kappa}\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} \qquad (9.47)$$

in the usual finite-difference form, using the notation of Figure 9.26.

Figure 9.26 Mesh for marching forward the solution of the heat-conduction equation.

We assume that we know the solution up to time step j and we wish to calculate the
solution at time step j + 1. In Section 9.3.5 we showed how to approximate the second
derivative as

$$\frac{\partial^2 u}{\partial x^2} \approx \frac{u(i+1, j) - 2u(i, j) + u(i-1, j)}{\Delta x^2}$$

To obtain the time derivative, we use the approximation between rows j and j + 1:

$$\frac{\partial u}{\partial t} \approx \frac{u(i, j+1) - u(i, j)}{\Delta t}$$

Putting these into (9.47) gives

$$\frac{u(i, j+1) - u(i, j)}{\kappa\Delta t} = \frac{u(i-1, j) - 2u(i, j) + u(i+1, j)}{\Delta x^2}$$

or, on rearranging,

$$u(i, j+1) = \lambda u(i-1, j) + (1 - 2\lambda)u(i, j) + \lambda u(i+1, j) \qquad (9.48)$$

where $\lambda = \kappa\Delta t/\Delta x^2$. Equation (9.48) gives a finite-difference representation of (9.47),
and provided that all the values are known on row j, we can then compute u on row j + 1
from the simple explicit formula (9.48).

First a simple example on a coarse grid.


Example 9.25

Use an explicit numerical method to solve the heat-conduction equation (9.47) subject
to the boundary conditions

(a) $u(0, t) = u(1, t) = 0$ $(t \ge 0)$ (both ends held at zero temperature);
(b) $u(x, 0) = \sin(\pi x)$ $(0 \le x \le 1)$ (a given initial temperature distribution).

Use the parameters $\Delta t = 0.1$, $\Delta x = 0.25$, $\kappa = 0.1$.


Solution

This problem has the exact solution $u = e^{-\kappa\pi^2 t}\sin(\pi x)$, so the accuracy of the numerical
solution can be checked easily.
At t = 0 (or j = 0) the initial values come from the boundary condition (b)

x | 0  0.25    0.5  0.75    1
u | 0  0.7071  1    0.7071  0
Example 9.26


Solution


Figure 9.27 Mesh for Example 9.26: $u(7, j) = 1$ and $u(0, j) = u(1, j)$ for all j;
$\Delta x = 1/6.5 = 0.1538$.


Before proceeding further we first note that $\lambda = 0.16$. At t = 0.1 (or j = 1) equation (9.48)
becomes

$$u(i, 1) = 0.16[u(i - 1, 0) + u(i + 1, 0)] + 0.68u(i, 0)$$

which, on calculation for the values x = 0.25, 0.5, 0.75 or i = 1, 2, 3, gives

x | 0  0.25    0.5     0.75    1
u | 0  0.6408  0.9063  0.6408  0

Note that at the ends, x = 0 and 1, the boundary condition (a) u = 0 is imposed.
At t = 0.2 (or j = 2)

$$u(i, 2) = 0.16[u(i - 1, 1) + u(i + 1, 1)] + 0.68u(i, 1)$$

and performing the calculations

x | 0  0.25    0.5     0.75    1
u | 0  0.5808  0.8213  0.5808  0

Similarly for t = 0.3 (or j = 3)

x       | 0  0.25    0.5     0.75    1
u       | 0  0.5263  0.7444  0.5263  0
u exact | 0  0.5259  0.7437  0.5259  0

and so on.

In the last table the exact values have been included for comparison.
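The hand calculation above is easily mechanized. The sketch below is illustrative only (the function name `explicit_heat_rows` is hypothetical); it applies formula (9.48) row by row with the parameters of this example and also evaluates the exact solution $e^{-\kappa\pi^2 t}\sin(\pi x)$ at t = 0.3.

```python
import math

def explicit_heat_rows(n_points, lam, steps):
    """March the explicit formula (9.48) from u(x, 0) = sin(pi*x) on 0 <= x <= 1
    with both ends held at zero; lam = kappa*dt/dx^2. Returns rows j = 0..steps."""
    dx = 1.0 / (n_points - 1)
    row = [math.sin(math.pi * i * dx) for i in range(n_points)]
    row[0] = row[-1] = 0.0
    rows = [row]
    for _ in range(steps):
        old = rows[-1]
        new = [0.0] * n_points            # end values stay at zero
        for i in range(1, n_points - 1):
            new[i] = lam * old[i - 1] + (1 - 2 * lam) * old[i] + lam * old[i + 1]
        rows.append(new)
    return rows

kappa, dt, dx = 0.1, 0.1, 0.25
lam = kappa * dt / dx ** 2                # 0.16 for these parameters
rows = explicit_heat_rows(5, lam, 3)
exact = [math.exp(-kappa * math.pi ** 2 * 0.3) * math.sin(math.pi * i * dx)
         for i in range(5)]
for j, row in enumerate(rows):
    print(j, [round(v, 4) for v in row])
print("exact", [round(v, 4) for v in exact])
```

The computed rows match the tables, and the final row differs from the exact values only in the third or fourth decimal place.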


Solve the heat-conduction equation (9.47) subject to the boundary conditions

(a) $\partial u(0, t)/\partial x = 0$ $(t \ge 0)$ (no heat flow through the end x = 0);
(b) $u(1, t) = 1$ $(t \ge 0)$ (unit temperature held at x = 1);
(c) $u(x, 0) = x^2$ $(0 \le x \le 1)$ (a given initial temperature distribution).


To fit the condition (a) most easily, we allow the first mesh space to straddle the t axis
as illustrated in Figure 9.27, where six intervals are used in the x direction. The mesh
implies that $\Delta x = 1/6.5 = 0.1538$; condition (a) gives $u(0, j) = u(1, j)$ while condition
(b) gives $u(7, j) = 1$.





Figure 9.28 Numerical solution of Example 9.26 at time $t = 20\Delta x^2/\kappa$, for
two values of $\lambda$.

Some simple MATLAB code produces the solution very quickly:

n=8; L=0.2; dx=1/6.5; x=((0:n-1)-0.5)*dx;    % values of the Example
u=x.^2; u(1)=u(2); u(n)=1;                   % initial data
v=u; v(2:n-1)=L*u(1:n-2)+(1-2*L)*u(2:n-1)+L*u(3:n); v(1)=v(2)
                                             % computes the first row
u=v; v(2:n-1)=L*u(1:n-2)+(1-2*L)*u(2:n-1)+L*u(3:n); v(1)=v(2)
                                             % repeat this line of code to obtain successive rows


The following table gives values for u at the three times $t = 0$, $0.2\Delta x^2/\kappa$ and $20\Delta x^2/\kappa$,
calculated with $\lambda = 0.2$. These results are then compared at $t = 20\Delta x^2/\kappa$, computed with
$\lambda$ taken to be 0.5. It may be noted that there are errors in the third significant figure
between the two cases.


© 
— 
кә 
чө 
AR 


5 6 7 





$\lambda$ = 0.2, j = 0    | 0.0059  0.0059  0.0533  0.1479  0.2899  0.4793  0.7160
$\lambda$ = 0.2, j = 1    | 0.0154  0.0154  0.0627  0.1574  0.2994  0.4888  0.7255
$\lambda$ = 0.2, j = 100  | 0.6817  0.6817  0.7002  0.7362  0.7874  0.8510  0.9233
$\lambda$ = 0.5, j = 40   | 0.6850  0.6850  0.7033  0.7389  0.7896  0.8526  0.9241


= = = = 





For $\lambda = 0.54$ we can obtain a solution that compares with the solution given, but for
$\lambda = 0.6$ the solution diverges wildly. Figure 9.28 shows a plot of the solution near the
critical value of $\lambda$ at a fixed time $t = 20\Delta x^2/\kappa$. As in the table, the solutions are accurate
to about 1% for small $\lambda$, but when $\lambda$ gets much above 0.5, oscillations creep in and the
solution is meaningless. In Figure 9.29 a further graph illustrates the development of
the solution in the two cases $\lambda = 0.2$ and $\lambda = 0.55$. One solution progresses smoothly
as time advances, while the other produces oscillations that will eventually lead to
divergence.

Figure 9.29 Plots of u against x and t from the solution of Example 9.26 with (a) $\lambda = 0.2$; (b) $\lambda = 0.55$.


Comparing Example 9.26 with the numerical solution of the wave equation undertaken
in Example 9.18, we observe similar behaviour for the explicit scheme, namely
that the method will only converge for small enough time steps or $\lambda$. From (9.48) it
may be noted that the middle term changes sign at $\lambda = 0.5$, and above this value we
might anticipate difficulties. Indeed, some straightforward numerical analysis shows
that convergence is certain for $\lambda < 0.5$. It is sufficient here to note that $\lambda$ must not be
too large.
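The sensitivity of the explicit formula (9.48) to the value of $\lambda$ can be demonstrated with a short experiment. The sketch below is illustrative only: the function name `heat_max_after`, the ten-interval mesh and the unit spike used as initial data are assumptions chosen to excite the short-wavelength components, and are not taken from Example 9.26.

```python
def heat_max_after(lam, steps, n_intervals=10):
    """Largest |u| after marching formula (9.48) for the given number of steps,
    starting from a unit spike at the mid-point with both ends held at zero."""
    u = [0.0] * (n_intervals + 1)
    u[n_intervals // 2] = 1.0
    for _ in range(steps):
        new = [0.0] * (n_intervals + 1)   # end values stay at zero
        for i in range(1, n_intervals):
            new[i] = lam * u[i - 1] + (1 - 2 * lam) * u[i] + lam * u[i + 1]
        u = new
    return max(abs(v) for v in u)

print(heat_max_after(0.4, 50))   # below 0.5: the spike simply spreads and decays
print(heat_max_after(0.6, 50))   # above 0.5: oscillations grow without bound
```

For $\lambda \le 0.5$ each new value is a weighted average of old values with non-negative weights, so the solution cannot grow; once $\lambda$ exceeds 0.5 the middle weight is negative and the sawtooth component is amplified at every step.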

To avoid the limitation on $\lambda$, we can again look at an implicit formulation of the
numerical equations. Returning to Figure 9.26, the idea is to approximate the x derivative
by an average of row j and row j + 1:

$$u(i, j+1) - u(i, j) = \lambda\{(1 - \alpha)[u(i-1, j) - 2u(i, j) + u(i+1, j)]$$
$$\quad + \alpha[u(i-1, j+1) - 2u(i, j+1) + u(i+1, j+1)]\}$$

where $0 \le \alpha \le 1$ is an averaging parameter. The case $\alpha = 0$ corresponds to the explicit
formulation (9.48), while $\alpha = \tfrac12$ is the best known implicit formulation, and constitutes
the Crank-Nicolson method. With $\alpha = \tfrac12$, we have

$$-\lambda u(i-1, j+1) + 2(1 + \lambda)u(i, j+1) - \lambda u(i+1, j+1)$$
$$= \lambda u(i-1, j) + 2(1 - \lambda)u(i, j) + \lambda u(i+1, j) \qquad (9.49)$$

We know the solution on row j, so the right-hand side of (9.49) is known, and the
unknowns on row (j + 1) have to be solved for simultaneously.

Fortunately the system is tridiagonal, so the very rapid Thomas algorithm can be
used. The method performs extremely well: it converges for all $\lambda$, and is the best known
approach to heat-conduction equations.


Example 9.27  Repeat Example 9.25 but with an implicit scheme.

Solution  At t = 0 (or j = 0) the initial values come from the boundary condition and are
identical for both the implicit and explicit formulations.

At t = 0.1 (or j = 1) the three equations of the type (9.49) corresponding to x = 0.25,
0.5, 0.75 or i = 1, 2, 3 are

−0.16u(0, 1) + 2.32u(1, 1) − 0.16u(2, 1) = 0.16[u(0, 0) + u(2, 0)] + 1.68u(1, 0)

−0.16u(1, 1) + 2.32u(2, 1) − 0.16u(3, 1) = 0.16[u(1, 0) + u(3, 0)] + 1.68u(2, 0)

−0.16u(2, 1) + 2.32u(3, 1) − 0.16u(4, 1) = 0.16[u(2, 0) + u(4, 0)] + 1.68u(3, 0)

After noting that the end boundary conditions give

u(0, 0) = u(0, 1) = u(4, 0) = u(4, 1) = 0

and the right-hand sides evaluated from the initial values, the equations can be written
in matrix form as

$$\begin{bmatrix} 2.32 & -0.16 & 0 \\ -0.16 & 2.32 & -0.16 \\ 0 & -0.16 & 2.32 \end{bmatrix} \begin{bmatrix} u(1, 1) \\ u(2, 1) \\ u(3, 1) \end{bmatrix} = \begin{bmatrix} 1.348 \\ 1.906 \\ 1.348 \end{bmatrix}$$


The tridiagonal system can be solved to give

x            |  0    0.25     0.5      0.75     1
u at t = 0.1 |  0    0.6438   0.9105   0.6438   0

For the next time steps the matrix equation is identical, with the j-suffix advanced by 1
at each time step and the right-hand sides re-evaluated from the most recently computed
values of u. Subsequent values are

x            |  0    0.25     0.5      0.75     1
u at t = 0.2 |  0    0.5862   0.829    0.5862   0
u at t = 0.3 |  0    0.5337   0.7547   0.5337   0

and should be compared with the explicit solution in Example 9.25.
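The time stepping of Example 9.27 can be sketched in a few lines of Python. This is only an illustrative sketch, not the text's own program: the function names are hypothetical, the boundary values are assumed constant in time, and the initial profile sin πx is an assumption inferred from the right-hand sides 1.348 and 1.906 quoted above (Example 9.25 itself is not reproduced here).

```python
import math

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution."""
    n = len(diag)
    c = [0.0] * n   # modified super-diagonal
    d = [0.0] * n   # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for k in range(1, n):
        m = diag[k] - lower[k] * c[k - 1]
        c[k] = upper[k] / m if k < n - 1 else 0.0
        d[k] = (rhs[k] - lower[k] * d[k - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for k in range(n - 2, -1, -1):
        x[k] = d[k] - c[k] * x[k + 1]
    return x

def crank_nicolson_step(u, lam):
    """Advance one row using (9.49); u[0] and u[-1] are fixed boundary values."""
    n = len(u) - 2   # number of interior nodes
    rhs = [lam * u[i - 1] + 2 * (1 - lam) * u[i] + lam * u[i + 1]
           for i in range(1, n + 1)]
    # the (assumed time-independent) boundary values on the new row are known,
    # so their contribution moves to the right-hand side
    rhs[0] += lam * u[0]
    rhs[-1] += lam * u[-1]
    interior = thomas([-lam] * n, [2 * (1 + lam)] * n, [-lam] * n, rhs)
    return [u[0]] + interior + [u[-1]]

lam = 0.16                                             # value used in Example 9.27
u = [math.sin(math.pi * 0.25 * i) for i in range(5)]   # assumed initial profile
u[-1] = 0.0                                            # remove rounding error in sin(pi)
rows = []
for step in range(1, 4):
    u = crank_nicolson_step(u, lam)
    rows.append(u)
    print("t = %.1f" % (0.1 * step), [round(v, 4) for v in u])
```

Under these assumptions the three printed rows can be compared directly with the tabulated values at t = 0.1, 0.2 and 0.3.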
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Example 9.28 


Solve the heat-conduction equation (9.47) using the implicit formulation (9.49) for the
boundary conditions

(a) u(0, t) = 0 (t ≥ 0) (the end x = 0 is kept at zero temperature);
(b) u(1, t) = 1 (t ≥ 0) (the end x = 1 is kept at unit temperature);
(c) u(x, 0) = 0 (0 ≤ x < 1) (initially the bar has zero temperature).

Here a bar is initially at zero temperature. At one end the temperature is raised to the
value 1 and kept at that value.

Solution  The matrix inversions are very tedious to perform by hand, so again a package
such as MATLAB is used to solve the equations quickly; note the use of the 'colon' notation.

n=11; L=0.3;                                                  % Example 9.28
u=zeros(n,1); u(n)=1;                                         % initial data
a=[-L 2*(1+L) -L]; A=eye(n); for i=2:n-1, A(i,i-1:i+1)=a; end
a=[L 2*(1-L) L]; B=eye(n); for i=2:n-1, B(i,i-1:i+1)=a; end
% sets up the matrices in equation (9.49)
DD=inv(A)*B; v=DD*u;                                          % solves for first row
u=v; v=DD*u;             % repeat this line of code for subsequent rows

The results of the calculation are presented in Figure 9.30. At time step 0 the
temperature distribution is discontinuous. The successive time steps 1, 10, 100 are
shown, and the final distribution u = x is labelled ∞.

Figure 9.30 Solution of Example 9.28 with Δx = 0.1, λ = 0.3 using the Crank-Nicolson
scheme.
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9.4.5 Exercises 


41 Derive the usual explicit finite-difference 43 Given that u satisfies the equation 
representation of the equation 
EU v ШШ а 
ди _ Ty ot ox 
ot ox . . © 
. . . and is subject to the boundary conditions 
Using this scheme with At = 0.02 and Ax = 0.2, 
determine an approximate solution of the equation ди =1 (=0,t>0) 
at t= 0.06, given that ox р 
и=х? when t=0 (0 =х 1) и=0 (х= 1,12 0) 
u=0 when х=0 (72 0) u-x(l-x) (r20,0x«l) 
u-l when x=1 (¢>0) derive a set of algebraic equations from the 
a Е . implicit formulation in Section 9.4.4. Use the 
42 Use both explicit and implicit numerical implicit method by adapting the MATLAB 


segment in Example 9.28. Find the solution at 
t = 0.02 and 0.04 using the values Ax = 0.2 and 
At=0.02. 


formulations to obtain solutions of the heat- 
T) conduction equation subject to the boundary 
conditions 


(а) и(0,)=0 (720) (b) ul, =e” (t=0) 
(c) u(x,0)=0 (0<x<1) 


Compare the two results for ¢ = 1. 


Solution of the Laplace equation 


In this section we consider methods of solving the Laplace equation introduced in 
Section 9.2.3. 


9.5.1 Separated solutions 


It is much less obvious how to construct separated solutions for the Laplace equation,
since there is less physical feel for the behaviour except that the solution will be
smooth. We shall therefore work more formally, as in Section 9.3.2, and seek a solution
of the Laplace equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \qquad (9.50)$$

in the form

u = X(x)Y(y)

which gives on substitution

$$Y\frac{d^2 X}{dx^2} + X\frac{d^2 Y}{dy^2} = 0$$

or

$$\frac{1}{X}\frac{d^2 X}{dx^2} = -\frac{1}{Y}\frac{d^2 Y}{dy^2} \qquad (9.51)$$

Now (d²X/dx²)/X is a function of x only and −(d²Y/dy²)/Y is a function of y only. Since
they must be equal for all x and y, both sides of (9.51) must be a constant, λ. We
therefore obtain two equations of simple-harmonic type

$$\frac{d^2 X}{dx^2} = \lambda X, \qquad \frac{d^2 Y}{dy^2} = -\lambda Y$$

The type of solution depends on the sign of λ, and we have a variety of possible solutions:

$$\lambda = -\mu^2 < 0: \quad u = (A\sin\mu x + B\cos\mu x)(C e^{\mu y} + D e^{-\mu y}) \qquad (9.52a)$$

$$\lambda = \mu^2 > 0: \quad u = (A e^{\mu x} + B e^{-\mu x})(C\sin\mu y + D\cos\mu y) \qquad (9.52b)$$

$$\lambda = 0: \quad u = (Ax + B)(Cy + D) \qquad (9.52c)$$

where A, B, C and D are arbitrary constants. Using the definitions of the hyperbolic
functions, it is sometimes more convenient to express the solution (9.52a) as

$$u = (A\sin\mu x + B\cos\mu x)(C\cosh\mu y + D\sinh\mu y) \qquad (9.52d)$$

and (9.52b) as

$$u = (A\sinh\mu x + B\cosh\mu x)(C\sin\mu y + D\cos\mu y) \qquad (9.52e)$$

The actual form of the solution depends on the problem in hand, as illustrated in the
following examples.


Example 9.29


Use the separated solutions (9.52) of the Laplace equation to find the solution to (9.50)
satisfying the boundary conditions

u(x, 0) = 0 (0 ≤ x ≤ 2)
u(x, 1) = 0 (0 ≤ x ≤ 2)
u(0, y) = 0 (0 ≤ y ≤ 1)
u(2, y) = a sin 2πy (0 ≤ y ≤ 1)

Solution  To satisfy the first two conditions, we need to choose the separated solutions that
include the sin μy terms. Thus we take solution (9.52b)

$$u = (A e^{\mu x} + B e^{-\mu x})(C\sin\mu y + D\cos\mu y)$$

The first boundary condition gives

$$(A e^{\mu x} + B e^{-\mu x})D = 0 \quad (0 \le x \le 2)$$

so that D = 0. Thus

$$u = (A' e^{\mu x} + B' e^{-\mu x})\sin\mu y$$

where A′ = AC and B′ = BC. The second boundary condition then gives

$$(A' e^{\mu x} + B' e^{-\mu x})\sin\mu = 0 \quad (0 \le x \le 2)$$

so that sin μ = 0, or μ = nπ with n an integer. Thus

$$u = (A' e^{n\pi x} + B' e^{-n\pi x})\sin n\pi y$$

From the third boundary condition,

$$(A' + B')\sin n\pi y = 0 \quad (0 \le y \le 1)$$

so that B′ = −A′, giving

$$u = A'(e^{n\pi x} - e^{-n\pi x})\sin n\pi y = 2A'\sinh n\pi x\,\sin n\pi y$$

The final boundary condition then gives

$$2A'\sinh 2n\pi\,\sin n\pi y = a\sin 2\pi y \quad (0 \le y \le 1)$$

We must therefore choose n = 2, and a = 2A′ sinh 2nπ = 2A′ sinh 4π, or 2A′ = a/sinh 4π.
The solution is therefore

$$u = a\sin 2\pi y\,\frac{\sinh 2\pi x}{\sinh 4\pi}$$


Example 9.30


Solve the Laplace equation (9.50) for steady heat conduction in the semi-infinite region
0 ≤ y ≤ 1, x ≥ 0 and subject to the boundary conditions

(a) u(x, 0) = 0 (x ≥ 0)
(b) u(x, 1) = 0 (x ≥ 0)
(c) u(x, y) → 0 as x → ∞
    (conditions (a) to (c): temperature kept at zero on two sides and at infinity);
(d) u(0, y) = 1 (0 ≤ y ≤ 1) (unit temperature on the fourth side).

Solution  Clearly from condition (c) we need a solution that is exponential in x, so we take
(9.52b):

$$u = (A e^{\mu x} + B e^{-\mu x})(C\sin\mu y + D\cos\mu y)$$

and since the solution must tend to zero as x → ∞, we have A = 0, giving

$$u = e^{-\mu x}(C'\sin\mu y + D'\cos\mu y)$$

where C′ = BC and D′ = BD. Condition (a) then gives D′ = 0, and (b) gives sin μ = 0,
or μ = nπ (n = 1, 2, ...), so the solution becomes u = C′e^{−nπx} sin nπy (n = 1, 2, ...).
Because of the linearity of the Laplace equation, we sum over n to obtain the more
general solution

$$u = \sum_{n=1}^{\infty} C'_n e^{-n\pi x}\sin n\pi y$$

Condition (d) then gives, as before, a classic Fourier series problem

$$1 = \sum_{n=1}^{\infty} C'_n \sin n\pi y \quad (0 \le y \le 1)$$

so that, using (7.33),

$$C'_n = 2\int_0^1 \sin n\pi y\,dy = \begin{cases} \dfrac{4}{n\pi} & (\text{odd } n) \\ 0 & (\text{even } n) \end{cases}$$

The complete solution is therefore

$$u = \frac{4}{\pi}\sum_{n=1}^{\infty} \frac{1}{2n-1} e^{-(2n-1)\pi x}\sin(2n-1)\pi y$$

or, in expanded form,

$$u = \frac{4}{\pi}\left(e^{-\pi x}\sin\pi y + \tfrac{1}{3}e^{-3\pi x}\sin 3\pi y + \tfrac{1}{5}e^{-5\pi x}\sin 5\pi y + \dots\right)$$

In Figure 9.31 the solution u(x, y) is plotted in the (x, y) plane. Because of the
discontinuity at x = 0, for x = 0.05 thirty terms of the series were required to compute u to
four-figure accuracy, while for x = 1, one or two terms were quite sufficient.

Figure 9.31 Solution u of the Laplace equation in Example 9.30.
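The remark about the number of terms needed can be explored with a short Python sketch that evaluates partial sums of the series solution of Example 9.30. The function name and the sample point y = 0.5 are illustrative choices, not taken from the text.

```python
import math

def u_series(x, y, terms):
    """Partial sum of u = (4/pi) * sum_{n>=1} exp(-(2n-1) pi x) sin((2n-1) pi y) / (2n-1)."""
    total = 0.0
    for n in range(1, terms + 1):
        k = 2 * n - 1
        total += math.exp(-k * math.pi * x) * math.sin(k * math.pi * y) / k
    return 4.0 / math.pi * total

# compare how quickly the partial sums settle near x = 0 and at x = 1
for x in (0.05, 1.0):
    for terms in (1, 2, 30, 200):
        print("x = %.2f, terms = %3d, u = %.6f" % (x, terms, u_series(x, 0.5, terms)))
```

Close to the discontinuous boundary x = 0 the exponential factors decay slowly and many terms contribute; at x = 1 the first term already dominates.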


It is clear from Example 9.30 that the solutions (9.52) can only be used for rectangular
regions. For various cylindrical and spherically symmetric regions separated solutions can
be constructed, but they need more complicated Bessel and Legendre functions. The
solutions have the same structure as for rectangular regions and follow the general theory
of orthogonal functions discussed in Section 7.7.2. For instance the study of Legendre
polynomials is required for problems similar to Example 9.24 when angular dependence
is included. There is great merit in calculating exact solutions where we can, since they
give significant insight. However, with modern computing techniques, obtaining an exact
solution is not necessarily quicker than a direct numerical solution.


Example 9.31  Solve the Laplace equation (9.50) in the region 0 ≤ x ≤ 1, 0 ≤ y ≤ 2 with the conditions

(a) u(x, 0) = x (0 ≤ x ≤ 1)
(b) u(x, 2) = 0 (0 ≤ x ≤ 1)
(c) u(0, y) = 0 (0 ≤ y ≤ 2)
(d) ∂u(1, y)/∂x = 0 (0 ≤ y ≤ 2)

Figure 9.32 Region and boundary conditions for Example 9.31.

Solution  The steady heat-conduction interpretation of this problem, looking at Figure 9.32, gives a
zero temperature on ABC, an insulated boundary on CD and a linear temperature on AD.

Of the solutions (9.52), we require zeros on AB and zero derivative on CD, so we
might expect to use trigonometric solutions in the x direction and exponential (or
equivalently sinh and cosh) solutions in the y direction. We therefore take a solution of
the form (9.52d):

$$u = (A\sin\mu x + B\cos\mu x)(C\cosh\mu y + D\sinh\mu y)$$

From condition (c), we must take B = 0, giving

$$u = (C'\cosh\mu y + D'\sinh\mu y)\sin\mu x$$

where C′ = AC and D′ = AD. Condition (d) then gives cos μ = 0 or

$$\mu = (n + \tfrac{1}{2})\pi \quad (n = 0, 1, 2, \dots)$$

so the solution becomes

$$u = [C'\cosh(n + \tfrac{1}{2})\pi y + D'\sinh(n + \tfrac{1}{2})\pi y]\sin(n + \tfrac{1}{2})\pi x \quad (n = 0, 1, 2, \dots) \qquad (9.53)$$

To satisfy condition (b), it is best to rewrite (9.53) in the equivalent form

$$u = \sin[(n + \tfrac{1}{2})\pi x]\{E\cosh[(n + \tfrac{1}{2})\pi(2 - y)] + F\sinh[(n + \tfrac{1}{2})\pi(2 - y)]\} \quad (n = 0, 1, 2, \dots)$$

We see that (b) now implies E = 0, so that our basic solution, summed over all n, is

$$u = \sum_{n=0}^{\infty} F_n \sin[(n + \tfrac{1}{2})\pi x]\sinh[(n + \tfrac{1}{2})\pi(2 - y)]$$

The final condition (a) then gives the standard Fourier series problem

$$x = \sum_{n=0}^{\infty} F_n \sinh[(2n + 1)\pi]\sin[(n + \tfrac{1}{2})\pi x]$$

so that, using (7.33),

$$\tfrac{1}{2}F_n\sinh(2n + 1)\pi = \int_0^1 x\sin[(n + \tfrac{1}{2})\pi x]\,dx = \frac{\sin(n + \tfrac{1}{2})\pi}{\pi^2(n + \tfrac{1}{2})^2}$$

The solution in expanded form is therefore

$$u = \frac{8}{\pi^2}\left[\frac{\sin\tfrac{1}{2}\pi x\,\sinh\tfrac{1}{2}\pi(2 - y)}{\sinh\pi} - \frac{\sin\tfrac{3}{2}\pi x\,\sinh\tfrac{3}{2}\pi(2 - y)}{9\sinh 3\pi} + \frac{\sin\tfrac{5}{2}\pi x\,\sinh\tfrac{5}{2}\pi(2 - y)}{25\sinh 5\pi} - \dots\right]$$

Curiously, Laplace transform solutions are not natural for the Laplace equation, since
there is no obvious semi-infinite parameter. Even in cases like Example 9.30, where we
have a semi-infinite region, the Laplace transform in x requires information that is not
available; see Section 9.8.2.

Another technique for solution of the Laplace equation involves complex variables and
was discussed in Section 4.3.2. It is a method that was very widely used in aerodynamics
and in electrostatic problems. Since the advent of modern computers, with highly efficient
Laplace solvers, the method has fallen somewhat into disuse. It was the cornerstone of
all early flight calculations and, for those interested either in the historical context or in
the beautiful mathematical theory, the study of complex-variable solutions is essential.
The real and imaginary parts of a differentiable function f(z) of the complex variable
z = x + jy automatically satisfy the Laplace equation. Example 9.6 showed a solution
that was interpreted as the flow past a cylinder; the function was obtained from the
imaginary part of a function f(z) of this kind. It is then possible to use the
Kutta-Joukowski transformation to transform the circle to an aerofoil shape and the lift
and drag on the aerofoil can be computed. Example 9.32 illustrates a much simpler situation.


Example 9.32  If f(z) = φ(x, y) + jψ(x, y) is a complex function of the complex variable z = x + jy,
verify that φ and ψ satisfy the Laplace equation for the case f(z) = z². Sketch the
contours of φ = constant and ψ = constant.

Solution  Now

f(z) = z² = (x² − y²) + j2xy

and thus

φ = x² − y², ψ = 2xy

It is trivial to differentiate these functions, and both clearly satisfy the Laplace equation:

∇²φ = 0, ∇²ψ = 0

The contours of φ and ψ are plotted in Figure 9.33. They are both hyperbolas, which
intersect at right angles. The usual interpretation of these solutions is as irrotational
inviscid fluid flow into a corner.

Figure 9.33 Complex-variable solution to the Laplace equation in Example 9.32, showing the
streamlines of flow into a corner (contours ψ = constant and φ = constant).
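The two claims of Example 9.32 (both functions are harmonic, and their contours cross at right angles) can be checked numerically. The following Python sketch uses central differences, which are exact for quadratics apart from rounding; the helper names and the sample point are illustrative assumptions.

```python
def phi(x, y):
    return x * x - y * y      # real part of z**2

def psi(x, y):
    return 2.0 * x * y        # imaginary part of z**2

def laplacian(f, x, y, h=1e-3):
    """Five-point finite-difference estimate of f_xx + f_yy."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4.0 * f(x, y)) / h**2

def gradient(f, x, y, h=1e-3):
    """Central-difference estimate of (f_x, f_y)."""
    return ((f(x + h, y) - f(x - h, y)) / (2.0 * h),
            (f(x, y + h) - f(x, y - h)) / (2.0 * h))

x0, y0 = 0.7, 0.4             # arbitrary sample point
z2 = complex(x0, y0) ** 2
print("phi, psi      :", phi(x0, y0), psi(x0, y0), "from z**2:", z2.real, z2.imag)
print("Laplacians    :", laplacian(phi, x0, y0), laplacian(psi, x0, y0))
gp, gs = gradient(phi, x0, y0), gradient(psi, x0, y0)
print("grad phi . grad psi:", gp[0] * gs[0] + gp[1] * gs[1])
```

A vanishing dot product of the two gradients is what makes the families φ = constant and ψ = constant orthogonal.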





If we now try to solve the Poisson equation (9.10), which is an extension of the
Laplace equation with heat sources/sinks f(x, y) on the right-hand side

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f(x, y)$$

the problem becomes harder as illustrated in Example 9.33.


Example 9.33  Solve the Poisson equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = -\sin\frac{\pi x}{a}\cos\frac{\pi y}{b}$$

in the rectangle 0 ≤ x ≤ a, 0 ≤ y ≤ b given the boundary conditions

u = 0 on x = 0, u = f(y) on x = a

∂u/∂y = 0 on y = 0 and y = b

Solution  Physically the problem can be interpreted as a heated plate with the temperature
specified on the two boundaries x = 0 and x = a and with insulated boundaries y = 0
and y = b.

The general strategy is to find a 'particular integral' to eliminate the term on the
right-hand side, compute the new boundary conditions and then solve the residual
Laplace equation. In the present case choose

$$U = K\sin\frac{\pi x}{a}\cos\frac{\pi y}{b}$$

Substitute into the Poisson equation to give

$$\nabla^2 U = -K\left(\frac{\pi^2}{a^2} + \frac{\pi^2}{b^2}\right)\sin\frac{\pi x}{a}\cos\frac{\pi y}{b}$$

and hence

$$\frac{1}{K} = \pi^2\left(\frac{1}{a^2} + \frac{1}{b^2}\right)$$

Now put

u = U + v

so that v satisfies the Laplace equation

∇²v = 0

and the boundary conditions remain the same

v = 0 on x = 0, v = f(y) on x = a

∂v/∂y = 0 on y = 0 and y = b

We are now back to a standard Laplace equation problem that can be solved by separation
of variables. From the solutions (9.52) choose

$$v = \tfrac{1}{2}A_0 x, \qquad v = A_n\sinh\frac{n\pi x}{b}\cos\frac{n\pi y}{b} \quad (n = 1, 2, 3, \dots)$$

These solutions satisfy three of the boundary conditions, just leaving

v = f(y) on x = a

to be satisfied. The usual infinite sum of terms is constructed

$$v = \tfrac{1}{2}A_0 x + \sum_{n=1}^{\infty} A_n\sinh\frac{n\pi x}{b}\cos\frac{n\pi y}{b}$$

so that on x = a the remaining boundary condition gives the usual Fourier cosine series
problem

$$f(y) = \tfrac{1}{2}A_0 a + \sum_{n=1}^{\infty}\left(A_n\sinh\frac{n\pi a}{b}\right)\cos\frac{n\pi y}{b}$$

In Chapter 7, (7.30) and (7.31) give

$$A_0 a = a_0 = \frac{2}{b}\int_0^b f(y)\,dy$$

$$A_n\sinh\frac{n\pi a}{b} = a_n = \frac{2}{b}\int_0^b f(y)\cos\frac{n\pi y}{b}\,dy \quad (n = 1, 2, 3, \dots)$$

The final solution is given by

$$u = \frac{\sin\dfrac{\pi x}{a}\cos\dfrac{\pi y}{b}}{\pi^2\left(\dfrac{1}{a^2} + \dfrac{1}{b^2}\right)} + \frac{a_0 x}{2a} + \sum_{n=1}^{\infty}\frac{a_n}{\sinh(n\pi a/b)}\sinh\frac{n\pi x}{b}\cos\frac{n\pi y}{b}$$

The method of solution described in Example 9.33 can be quite difficult and clumsy.
Finding the 'particular integral', U, is not always easy even for simple right-hand sides.
If U can be found, the new boundary conditions on v can become very awkward;
often further substitutions need to be made to bring the problem to tractable form. In
Example 9.33 the right-hand side was carefully chosen to avoid this extra difficulty. An
alternative method using Green's functions will be considered in Section 9.7.2. The
solution turns out to be very neat with the right-hand side and the boundary conditions
appearing naturally in various integrals. However, although neat, the computation of
the Green's function is just as difficult as the method described in Example 9.33.


9.5.2 Exercises


44 Use the separated solutions (9.52) to solve the Laplace equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$

in the region 0 ≤ x ≤ 1, y ≥ 0 given the boundary conditions

(a) u = 0 on x = 0 and x = 1 (y ≥ 0)
(b) u → 0 as y → ∞ (0 ≤ x ≤ 1)
(c) u = sin⁵(πx) on y = 0 (0 ≤ x ≤ 1)

(Note: the identity sin⁵θ = (1/16)(sin 5θ − 5 sin 3θ + 10 sin θ).)

45 Show that the function u(x, y) = e^{−πx/2}[y cos(πy/2) − x sin(πy/2)] satisfies the
Laplace equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$

and the boundary conditions

(a) u = 0 on y = 0 (x ≥ 0)
(b) u = −x e^{−πx/2} on y = 1 (x ≥ 0)
(c) u = y cos(πy/2) on x = 0 (0 ≤ y ≤ 1)
(d) u → 0 as x → ∞

46 Show that the function φ(x, y) = x²y satisfies the Poisson equation

$$\frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} = 2y$$

By putting φ = u + x²y, show that u satisfies the Laplace equation. Find the solution
for φ in the unit square which satisfies the boundary conditions

φ(x, 0) = 0
φ(x, 1) = x + sin πx
φ(0, y) = 0
φ(1, y) = y

47 Show that

u(r, θ) = B rⁿ sin nθ

satisfies the Laplace equation in polar coordinates,

$$u_{rr} + \frac{1}{r}u_r + \frac{1}{r^2}u_{\theta\theta} = 0$$

Determine u that is both finite for r ≤ a and periodic in θ, given that

u(a, θ) = sin³θ = ¾ sin θ − ¼ sin 3θ

48 Verify that

$$u = \frac{x^2 + y^2 - 1}{x^2 + y^2 + 2x + 1}, \qquad v = \frac{2y}{x^2 + y^2 + 2x + 1}$$

both satisfy the Laplace equation, and sketch the curves u = constant and v = constant.
Show that

$$u + jv = \frac{z - 1}{z + 1}$$

where z = x + jy.

49 (Hadamard example) Show that the Laplace equation

∂²u/∂x² + ∂²u/∂y² = 0

with u(0, y) = 0, ∂u/∂x(0, y) = (1/n) sin ny (n > 0) has the solution

$$u(x, y) = \frac{1}{n^2}\sinh nx\,\sin ny$$

Compare this solution, for large n, with the solution to the 'neighbouring' problem,
when u(0, y) = 0, ∂u/∂x(0, y) = 0, and the solution u(x, y) = 0.

50 Solve the Laplace equation ∂²u/∂x² + ∂²u/∂y² = 0 in the region 0 ≤ x ≤ 1, 0 ≤ y ≤ 1
subject to the boundary conditions u(0, y) = 0, u(x, 0) = 0, u(1, y) = 1, u(x, 1) = 1 by
separation methods.

51 A long bar of square cross-section 0 ≤ x ≤ a, 0 ≤ y ≤ a has the faces x = 0, x = a and
y = 0 maintained at zero temperature, and the face y = a at a control temperature u₀.
Under steady-state conditions the temperature u(x, y) at a point in a cross-section
satisfies the Laplace equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$

Write down the boundary conditions for u(x, y), and hence show that u(x, y) is given by

$$u(x, y) = \frac{4u_0}{\pi}\sum_{n=0}^{\infty}\frac{\operatorname{cosech}(2n + 1)\pi}{2n + 1}\sinh\frac{(2n + 1)\pi y}{a}\sin\frac{(2n + 1)\pi x}{a}$$
а а 
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52 


Heat is flowing steadily in a metal plate whose shape is an infinite rectangle occupying
the region −a ≤ x ≤ a, y ≥ 0 of the (x, y) plane. The temperature at the point (x, y) is
denoted by u(x, y). The sides x = ±a are insulated, the temperature approaches zero as
y → ∞, while the side y = 0 is maintained at a fixed temperature −T for −a ≤ x < 0 and
T for 0 < x ≤ a. It is known that u(x, y) satisfies the Laplace equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$

and the boundary conditions

(a) u → 0 as y → ∞ for all x in −a ≤ x ≤ a
(b) ∂u/∂x = 0 when x = ±a
(c) u(x, 0) = −T (−a ≤ x < 0), u(x, 0) = T (0 < x ≤ a)

Using the method of separation, obtain the solution u(x, y) in the form

$$u(x, y) = \frac{4T}{\pi}\sum_{n=0}^{\infty}\frac{1}{2n + 1}\exp\left[-(n + \tfrac{1}{2})\frac{\pi y}{a}\right]\sin\left[(n + \tfrac{1}{2})\frac{\pi x}{a}\right]$$

53 A thin semicircular plate of radius a has its bounding diameter kept at zero temperature
and its curved boundary at a constant temperature T₀. The steady-state temperature
T(r, θ) at a point having polar coordinates (r, θ), referred to the centre of the circle as
origin, is given by the Laplace equation

$$\frac{\partial^2 T}{\partial r^2} + \frac{1}{r}\frac{\partial T}{\partial r} + \frac{1}{r^2}\frac{\partial^2 T}{\partial\theta^2} = 0$$

Assuming a separated solution of the form

T = R(r)Θ(θ)

show that

$$T(r, \theta) = \frac{4T_0}{\pi}\sum_{n=0}^{\infty}\frac{(r/a)^{2n+1}}{2n + 1}\sin(2n + 1)\theta$$

54 The Laplace equation in spherical polar coordinates (r, θ, φ) takes the form

$$\frac{\partial}{\partial r}\left(r^2\frac{\partial V}{\partial r}\right) + \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial V}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2 V}{\partial\phi^2} = 0$$

If V is only a function of r and θ, and V takes the form

V = R(r)y(x), where x = cos θ

show that

$$\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) = k(k + 1)R$$

$$(1 - x^2)\frac{d^2 y}{dx^2} - 2x\frac{dy}{dx} + k(k + 1)y = 0$$

where k is a constant.

The function V satisfies the Laplace equation in the region a ≤ r ≤ b. On r = a, V = 0 and
on r = b, V = α sin²θ, where α is a constant. Given that solutions for y are

y = 1 (k = 0)
y = x (k = 1)
y = ½(3x² − 1) (k = 2)

find V throughout the region.


9.5.3 Numerical solution


Of the three classical partial differential equations, the Laplace equation proves to be
the most difficult to solve. The other two have a natural time variable in them, and it is
possible, with a little care, to march forward either by a simple explicit method or by an
implicit procedure. In the case of the Laplace equation, information is given around the
whole of the boundary of the solution region, so the field variables at all mesh points
must be solved simultaneously. This in turn leads to a solution by matrix inversion.
The usual numerical approximations for the partial derivatives, discussed in Section 9.3.5,
are employed, so that the equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \qquad (9.54)$$

at a typical point, illustrated in Figure 9.34, becomes

$$\frac{u(i+1, j) - 2u(i, j) + u(i-1, j)}{\Delta x^2} + \frac{u(i, j+1) - 2u(i, j) + u(i, j-1)}{\Delta y^2} = 0$$

For the case Δx = Δy rearranging gives

$$4u(i, j) = u(i+1, j) + u(i-1, j) + u(i, j+1) + u(i, j-1) \qquad (9.55)$$

Figure 9.34 Five-point computational module for the Laplace equation.

In the typical five-point module (9.55) the increments Δx and Δy are taken to be the
same and it is noted that the middle value u(i, j) is the average of its four neighbours.
This corresponds to the absence of 'hot spots'. We now examine how (9.55) can be
implemented.


Example 9.34





Solve the Laplace equation (9.54) in the square region 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 with the
boundary conditions

(a) u = 0 on x = 0 (b) u = 1 on x = 1
(c) u = 0 on y = 0 (d) u = 0 on y = 1

Solution  For a first solution we take the simplest mesh, illustrated in Figure 9.35(a), which
contains only four interior points labelled u₁, u₂, u₃ and u₄. The four equations obtained
from (9.55) are

4u₁ = 0 + 0 + u₂ + u₄
4u₂ = 0 + 1 + u₃ + u₁
4u₃ = 1 + 0 + u₂ + u₄
4u₄ = 0 + 0 + u₁ + u₃

which in turn can be written in matrix form as

$$\begin{bmatrix} 4 & -1 & 0 & -1 \\ -1 & 4 & -1 & 0 \\ 0 & -1 & 4 & -1 \\ -1 & 0 & -1 & 4 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}$$

Figure 9.35 Meshes for the solution of the Laplace equation in Example 9.34: (a) a simple mesh containing 4 interior
points; (b) a larger mesh with 49 interior points.

This has the solution u₁ = 0.125, u₂ = 0.375, u₃ = 0.375, u₄ = 0.125. A larger mesh
obtained by dividing the sides up into eight equal parts is indicated in Figure 9.35(b).
The equations now take the form


4u(1, 1) = u(2, 1) + 0 + u(1, 2) + 0
4u(2, 1) = u(3, 1) + u(1, 1) + u(2, 2) + 0
...
4u(7, 1) = 1 + u(6, 1) + u(7, 2) + 0
4u(1, 2) = u(2, 2) + 0 + u(1, 3) + u(1, 1)
4u(2, 2) = u(3, 2) + u(1, 2) + u(2, 3) + u(2, 1)
...
4u(7, 2) = 1 + u(6, 2) + u(7, 3) + u(7, 1)
...


We thus generate 49 linear equations in 49 unknowns, which can be solved by any
convenient matrix inverter. The matrices take the block form

$$A = \begin{bmatrix} 4 & -1 & 0 & 0 & 0 & 0 & 0 \\ -1 & 4 & -1 & 0 & 0 & 0 & 0 \\ 0 & -1 & 4 & -1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 4 & -1 & 0 & 0 \\ 0 & 0 & 0 & -1 & 4 & -1 & 0 \\ 0 & 0 & 0 & 0 & -1 & 4 & -1 \\ 0 & 0 & 0 & 0 & 0 & -1 & 4 \end{bmatrix}, \qquad B = -I_7, \qquad U_k = \begin{bmatrix} u(1, k) \\ u(2, k) \\ \vdots \\ u(7, k) \end{bmatrix}, \qquad c = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}$$

where I₇ is the 7 × 7 unit matrix, so that the equations become

$$\begin{bmatrix} A & B & 0 & 0 & 0 & 0 & 0 \\ B & A & B & 0 & 0 & 0 & 0 \\ 0 & B & A & B & 0 & 0 & 0 \\ 0 & 0 & B & A & B & 0 & 0 \\ 0 & 0 & 0 & B & A & B & 0 \\ 0 & 0 & 0 & 0 & B & A & B \\ 0 & 0 & 0 & 0 & 0 & B & A \end{bmatrix}\begin{bmatrix} U_1 \\ U_2 \\ U_3 \\ U_4 \\ U_5 \\ U_6 \\ U_7 \end{bmatrix} = \begin{bmatrix} c \\ c \\ c \\ c \\ c \\ c \\ c \end{bmatrix} \qquad (9.56)$$

The matrix equation (9.56) can be solved by an elimination technique or an iterative method
like successive over-relaxation (SOR). As indicated in Section 5.5.4 of Modern Engineering
Mathematics, SOR is the simplest to program, and elimination techniques are best
performed by a package from a computer library. For the current problem we present the
solution in Figure 9.36, where the cases Δx = ⅛ and Δx = ¼ are both shown. It may be seen
from this example that the accuracy of the solution is quite tolerable when the cases Δx = ⅛
and Δx = ¼ are compared. Note the averaging behaviour of the Laplace equation and
observe that the discontinuity in the corner does not spread into the solution. The corner
nodes are never used in the numerical calculation, so the discontinuity is avoided.

j values
8 |  0   0        0        0        0        0        0        0       1
7 |  0   0.017    0.038    0.064    0.103    0.164    0.269    0.483   1
6 |  0   0.032    0.069    0.117    0.184    0.282    0.431    0.661   1
5 |  0   0.042    0.089    0.150    0.233    0.350    0.512    0.731   1
4 |  0   0.045    0.096    0.162    0.250    0.371    0.536    0.749   1
  |               (0.098)           (0.250)           (0.527)
3 |
2 | (0)           (0.071)           (0.188)           (0.429)          1
1 |
0 |  0            0                 0                 0                1
     0   1        2        3        4        5        6        7       8    i values

Figure 9.36 The solution of Example 9.34. The solution is symmetric about the line j = 4; the
solution with Δx = 0.125 is given in the upper half and the solution with Δx = 0.25 is shown in
parentheses in the lower half.

Figure 9.37 Fictitious nodes, i = −1, introduced outside the boundary.


The solution of the Laplace equation is often required, so packages like MATLAB have
the machinery to set up the solution for simple regions. For the 9 × 9 problem, with
49 unknowns, the code is listed; note that the MATLAB numbering of the nodes is
different from the text.

G=numgrid('S',9);                 % sets up the numbering for a 9 x 9 square
A=delsq(G);                       % stored as a sparse matrix
rhs=zeros(49,1); rhs(43:49)=1;    % computes rhs
A\rhs                             % gives the quoted solution

Because of its simplicity, SOR is an attractive method for solving Laplace-type problems.
Equations (9.55) are rewritten with an iteration superscript as

$$u^{n+1}(i, j) = u^n(i, j) + \tfrac{1}{4}w[u^n(i+1, j) + u^n(i-1, j) + u^n(i, j+1) + u^n(i, j-1) - 4u^n(i, j)] \qquad (9.57)$$

where w is a relaxation factor, discussed in Chapter 5 of Modern Engineering Mathematics.

Knowing all the u(i, j) at iteration n, we can use (9.57) to evaluate u(i, j) at iteration
n + 1. Normally the u(i, j) are over-written in the computer as they are computed, so that
some of the n's in the right-hand side of (9.57) become (n + 1)'s. The order of evaluation
of the i's and j's in (9.57) is critical, but the most obvious methods by rows or columns
prove to be satisfactory.

A great deal is known about the optimum relaxation factor w. It is closely related to
the value of the maximum eigenvalue of the matrix associated with the problem. For
square regions with unit side and with u given on the boundary and equal mesh spacing
it can be shown that w = 2/(1 + sin πΔx) is the best value. For other problems this is
usually used as a starting guess, but numerical experimentation is required to determine
an optimum or near-optimum value.
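As an illustration of (9.57), the following Python sketch applies SOR to the problem of Example 9.34 (u = 1 on x = 1 and u = 0 on the other three sides). It is a minimal sketch under stated assumptions: the function name, the stopping tolerance and the row-by-row sweep order are illustrative choices, and the relaxation factor is taken as w = 2/(1 + sin πΔx).

```python
import math

def sor_laplace(n, tol=1e-10, max_sweeps=10000):
    """SOR for the Laplace equation on the unit square with n mesh intervals per side.

    Boundary data as in Example 9.34: u = 1 on x = 1, u = 0 on the other sides.
    Returns the array u[i][j] (i along x, j along y) and the number of sweeps used.
    """
    h = 1.0 / n
    w = 2.0 / (1.0 + math.sin(math.pi * h))        # relaxation factor for the unit square
    u = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(1, n):
        u[n][j] = 1.0                              # corner nodes are never used
    for sweep in range(1, max_sweeps + 1):
        change = 0.0
        for j in range(1, n):                      # sweep by rows, over-writing in place
            for i in range(1, n):
                residual = (u[i + 1][j] + u[i - 1][j] + u[i][j + 1]
                            + u[i][j - 1] - 4.0 * u[i][j])
                delta = 0.25 * w * residual        # correction term of (9.57)
                u[i][j] += delta
                change = max(change, abs(delta))
        if change < tol:
            return u, sweep
    return u, max_sweeps

fine, sweeps_fine = sor_laplace(8)      # mesh spacing 0.125
coarse, sweeps_coarse = sor_laplace(4)  # mesh spacing 0.25
print("spacing 0.125, row j = 4:", [round(fine[i][4], 3) for i in range(9)], sweeps_fine)
print("spacing 0.25,  row j = 2:", [round(coarse[i][2], 3) for i in range(5)], sweeps_coarse)
```

The two printed rows lie on the centre line y = 0.5 and can be compared with the j = 4 row of Figure 9.36 and the parenthesized values beneath it.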

We have only considered u to be given on the boundary, and it is essential to know
how to deal with derivative boundary conditions, since these are very common. Let us
consider a typical example:

$$\frac{\partial u}{\partial x} = g(y) \quad \text{on } x = 0$$

We then insert a fictitious line of nodes, as shown in Figure 9.37. Approximately, the
boundary condition gives

u(1, j) − u(−1, j) = 2Δx g(y)

so that

u(−1, j) = u(1, j) − 2Δx g(y) (9.58)

Equations (9.55) or (9.57) are now solved for i = 0 as well as i > 0, but at the end of a
sweep u(−1, j) will be updated via (9.58).
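The fictitious-node device can be sketched in Python as follows. The sketch assumes the unit square with ∂u/∂x = g(y) on x = 0 and u = 0 on the other three sides, and it substitutes (9.58) directly when the node i = 0 is updated rather than storing u(−1, j) separately; the function name, the relaxation factor 1.2 and the fixed number of sweeps are illustrative choices.

```python
def solve_neumann_left(n, g, w=1.2, sweeps=200):
    """SOR on the unit square with du/dx = g(y) on x = 0 and u = 0 on the other sides.

    n is the number of mesh intervals per side; returns u[i][j] with i along x, j along y.
    """
    h = 1.0 / n
    u = [[0.0] * (n + 1) for _ in range(n + 1)]
    for _ in range(sweeps):
        for j in range(1, n):
            for i in range(0, n):                       # i = 0 is now an unknown too
                if i == 0:
                    # fictitious node from (9.58): u(-1, j) = u(1, j) - 2 h g(y_j)
                    left = u[1][j] - 2.0 * h * g(j * h)
                else:
                    left = u[i - 1][j]
                residual = u[i + 1][j] + left + u[i][j + 1] + u[i][j - 1] - 4.0 * u[i][j]
                u[i][j] += 0.25 * w * residual          # update (9.57)
    return u

# heat supply g(y) = 1/2 - y with mesh spacing 1/3, giving six unknowns
u3 = solve_neumann_left(3, lambda y: 0.5 - y)
for j in (1, 2):
    print("y = %d/3:" % j, ["%.5f" % u3[i][j] for i in range(3)])
```

Because g(y) = ½ − y is antisymmetric about y = ½, the two printed rows should be equal and opposite.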
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Example 9.35 Solve the Laplace equation (9.54) for steady-state heat conduction in the unit square, 
given that

(a) ∂u/∂x = ½ − y on x = 0 (steady heat supply on this boundary);
(b) u = 0 on x = 1, y = 0, y = 1 (zero temperature on the other three sides).

Use Δx = Δy = ⅓.

Figure 9.38 The mesh for Example 9.35.

Solution  Labelling the six unknown values u₁, u₂, ..., u₆ as shown in Figure 9.38, equation (9.55)
gives

4u₁ = 0 + u₋₁ + u₂ + u₄
4u₂ = 0 + u₁ + u₃ + u₅
4u₃ = 0 + 0 + u₂ + u₆
4u₄ = u₋₄ + u₅ + 0 + u₁
4u₅ = u₄ + u₆ + 0 + u₂
4u₆ = u₅ + 0 + 0 + u₃

The values u₋₁ and u₋₄ are evaluated from boundary condition (a)

u₋₁ = u₂ − 2h(½ − ⅓) = u₂ − 1/9,  u₋₄ = u₅ − 2h(½ − ⅔) = u₅ + 1/9

so the equations become

4u₁ = 2u₂ + u₄ − 1/9
4u₂ = u₁ + u₃ + u₅
4u₃ = u₂ + u₆
4u₄ = u₁ + 2u₅ + 1/9
4u₅ = u₂ + u₄ + u₆
4u₆ = u₃ + u₅

Thus there are six linear equations in six unknowns, which can be solved by any
convenient method. For instance, SOR as suggested in (9.57) gives the set of equations
with iteration counter n and relaxation factor w

$$u_1^{n+1} = u_1^n + \tfrac{1}{4}w(2u_2^n + u_4^n - \tfrac{1}{9} - 4u_1^n)$$
$$u_2^{n+1} = u_2^n + \tfrac{1}{4}w(u_1^{n+1} + u_3^n + u_5^n - 4u_2^n)$$
$$u_3^{n+1} = u_3^n + \tfrac{1}{4}w(u_2^{n+1} + u_6^n - 4u_3^n)$$
$$u_4^{n+1} = u_4^n + \tfrac{1}{4}w(u_1^{n+1} + 2u_5^n + \tfrac{1}{9} - 4u_4^n)$$
$$u_5^{n+1} = u_5^n + \tfrac{1}{4}w(u_2^{n+1} + u_4^{n+1} + u_6^n - 4u_5^n)$$
$$u_6^{n+1} = u_6^n + \tfrac{1}{4}w(u_3^{n+1} + u_5^{n+1} - 4u_6^n)$$

The equations are diagonally dominant so the iterations converge quickly; six significant
figures are obtained in 11 iterations with w = 1 and at near optimum w = 1.2 in 8
iterations.

u₁          u₂          u₃          u₄         u₅         u₆
−0.02424    −0.00505    −0.00101    0.02424    0.00505    0.00101
−0.03121    −0.00517    −0.00077    0.03121    0.00517    0.00077
−0.03357    −0.00522    −0.00068    0.03357    0.00522    0.00068

The expected symmetry is observed from the solution; physically heat is supplied to
the bottom half of the left-hand boundary and an equal amount is extracted from the
top half. The first row of the table is the solution of the six equations above; for
comparison of the accuracy, calculations with two smaller values of h have been
included in the table.


Example 9.36


Solve the Poisson equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = -1$$

for steady heat conduction with a heating term, in the unit square given that

(a) ∂u/∂x = 0 on x = 0 (insulated along this boundary);
(b) u = y² on x = 1
(c) u = 0 on y = 0
(d) u = x on y = 1
    (conditions (b) to (d): temperature given on three sides).

Solution  The region is illustrated in Figure 9.39, with mesh spacing Δx = Δy = ¼. Equation (9.58)
just gives u(−1, j) = u(1, j) for each j. Equation (9.55) is modified to take into account
the right-hand side of the Poisson equation. The value Δx² is added to the right-hand
side of (9.55) for each of the interior points. We can write the equations as

4u(0, 1) = u(−1, 1) + u(1, 1) + 0 + u(0, 2) = 2u(1, 1) + u(0, 2)
4u(1, 1) = u(0, 1) + u(2, 1) + 0 + u(1, 2) + 1/16
4u(2, 1) = u(1, 1) + u(3, 1) + 0 + u(2, 2) + 1/16

and so on, and hence obtain 12 equations in 12 unknowns. These can be solved by any
convenient method to give a solution as shown in Figure 9.40.

Figure 9.39 The mesh used in Example 9.36.

         i = 0     i = 1     i = 2     i = 3     i = 4
j = 4    0.0000    0.2500    0.5000    0.7500    1.0000
j = 3    0.2050    0.3021    0.4278    0.5329    0.5625
j = 2    0.2160    0.2630    0.3137    0.3290    0.2500
j = 1    0.1329    0.1578    0.1727    0.1567    0.0625
j = 0    0.0000    0.0000    0.0000    0.0000    0.0000

Figure 9.40 Data from the solution of Example 9.36 using a step length 0.25 in each direction.
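The twelve equations of Example 9.36 can be generated and solved iteratively with a short Python sketch. It follows the treatment described in the example (the term Δx² is added at the interior points i ≥ 1 only, and u(−1, j) = u(1, j) on the insulated edge); the function name, the relaxation factor 1.3 and the fixed number of sweeps are illustrative assumptions.

```python
def solve_example_936(n=4, w=1.3, sweeps=500):
    """Iterative solution of the difference equations of Example 9.36.

    n is the number of mesh intervals per side; returns u[i][j], i along x, j along y.
    """
    h = 1.0 / n
    u = [[0.0] * (n + 1) for _ in range(n + 1)]    # condition (c): u = 0 on y = 0
    for j in range(n + 1):
        u[n][j] = (j * h) ** 2                     # condition (b): u = y^2 on x = 1
    for i in range(n + 1):
        u[i][n] = i * h                            # condition (d): u = x on y = 1
    for _ in range(sweeps):
        for j in range(1, n):
            for i in range(0, n):
                if i == 0:
                    # insulated edge: (9.58) with zero gradient gives u(-1, j) = u(1, j)
                    total = 2.0 * u[1][j] + u[0][j + 1] + u[0][j - 1]
                else:
                    # interior point: heating term adds h^2 to the right-hand side of (9.55)
                    total = (u[i + 1][j] + u[i - 1][j] + u[i][j + 1]
                             + u[i][j - 1] + h * h)
                u[i][j] += w * (0.25 * total - u[i][j])
    return u

u = solve_example_936()
for j in range(4, -1, -1):
    print("j = %d:" % j, ["%.4f" % u[i][j] for i in range(5)])
```

The printed rows are arranged in the same way as the table of Figure 9.40, so the two can be compared entry by entry.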








9.5.4 Exercises


55 Use the five-point difference approximation in 
S (9.55) to solve 
2 2 
биби (0<x<1,0<y<1) 
Ox oy 


56 


57 


where u(x, 0) = u(0, y) 2 0, u(x, 1) 2 x, u(l, y) 
= у(2 - y). Find the approximations for u(1, 1) for 
grid sizes Ax = Ay = į and Ax = Ay =}. 
Use a mesh Ax = Ay =} and Ax = Ay = І іо 
solve 
2 2 

ди X gu -0 

Ox ду 
satisfying u(0, y) 2 1, du(1, y)/àx = 0, u(x, 0) = 0, 
u(x, 1) 2 1. 


(0<х<1,0<у<1) 


A numerical solution is to be determined for the 
loading of a uniform plate, where the displacement 
w satisfies the equation 
2 2 
2w + en +20=0 
дх ду 


58 


and a square mesh of side / is used. Show that, at 
an interior point 0 with neighbours 1, 2, 3 and 4, the 
approximation to the equation is 


4w = wi + w, + Ws + Wy + 2017 


The plate is in the shape of a trapezium whose 
vertices can be represented by the points (0, 0), 

(5, 0), (2, 3) and (0, 3). The plate is held on its edges 
so that on the boundary w = 0. Compute the solution 
for w at the five interior points if / is taken as 1. 


The function (x, y) satisfies the equation 


2 2 
Qe 06... 
oy ду 


and the boundary conditions (see Figure 9.41) 


$23-y! onOA(x20,0xyxl) 
9224 оп АВ (у= 1,0 <х < 1) 
y: 

ф=1 оп ВС (х= 1,0 <у« 1) 


ф= 3-х оһСО (у= 0,0 <х < 1) 
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59 


X 
A B 
c 
о x 


Figure 9.41 Region for Exercise 58. 


Solve the equation numerically, using a mesh of 


(а) Л = 1 in each direction, (b) A = + in each direction. 


2 4 


The function (x, y) satisfies the Laplace equation 


2 2 
“0,90, 
Ox ду 


inside the region shown in Figure 9.42. The 
function ó takes the value $ — 9x? at all points on the 
boundary. Making full use of symmetry, formulate 


Finite elements


In Section 9.5.3 we sought numerical solutions of the Laplace equation, but we noted
that only simple geometries could be handled using finite differences. The region
described in Exercise 57 is about as difficult as can be treated easily. To adapt methods
to awkward regions is not easy, so alternative strategies have been sought. Great
advances took place in the 1960s, when civil engineers pioneered the method of finite
elements. To solve plate bending problems, they solved the appropriate equations for
small patches and then 'stitched' the latter together to form an overall solution. The job
is not an easy one, and requires a large amount of arithmetic. It was only when large, fast
computers became available that the method was viable. This method is now very widely
used, and forms the basis of most calculations in stress analysis and for many fluid
flows. It is very adaptable and physically satisfying, but is very difficult to program.
This is in contrast to finite differences, which are reasonably easy. In general the advice
to anyone employing this technique is to use a finite-element 'package', available in
most computer libraries, and not to write one's own program. For instance in the Partial
Differential Equation Toolbox in MATLAB, finite elements is the standard method of
solution even for solutions in a rectangular region. As with many toolboxes of this type,
a lot of work is needed to master all the details. It is important, however, to understand
the basis of the method. We shall illustrate this method for a simple situation, but refer
to specialist books for details and extensions: for example, see P. E. Lewis and J. P. Ward
The Finite Element Method (Addison-Wesley, Reading, MA, 1991).
We consider solutions of the Poisson equation 


2 2 
Ate pe) 





-l 


Figure 9.42 Region for Exercise 59. 


a set of finite-difference equations to solve for the 
nodal values of ф оп а square grid of side A — +. 
Solve for $ at the nodal points. 


(9.59) 


Figure 9.43 
Triangular finite- 
element mesh, with the 
local numbering of a 
typical element. 


Figure 9.44 

(a) The function Lo; 
(b) u approximated 
as a linear function 
in the element. 
in a region $R$ with $u$ given on the boundary of $R$. The region $R$ is divided up into a
triangular mesh as in Figure 9.43. We aim to calculate the value of $u$ at the nodal points
of the mesh, but with the function suitably interpolated in each triangle. The simplest
situation is obtained if, in a typical triangle, $u$ is approximated as a linear function

$$u = ax + by + c \tag{9.60}$$


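taking the values $u_0$, $u_1$ and $u_2$ at the corners. This function can be written explicitly in
terms of the functions

$$
L_0 = \begin{vmatrix} x & y & 1 \\ x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \end{vmatrix} \bigg/ \begin{vmatrix} x_0 & y_0 & 1 \\ x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \end{vmatrix}
$$

$$
L_1 = \begin{vmatrix} x & y & 1 \\ x_2 & y_2 & 1 \\ x_0 & y_0 & 1 \end{vmatrix} \bigg/ \begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_0 & y_0 & 1 \end{vmatrix} \tag{9.61}
$$

$$
L_2 = \begin{vmatrix} x & y & 1 \\ x_0 & y_0 & 1 \\ x_1 & y_1 & 1 \end{vmatrix} \bigg/ \begin{vmatrix} x_2 & y_2 & 1 \\ x_0 & y_0 & 1 \\ x_1 & y_1 & 1 \end{vmatrix}
$$

each of which is taken to be zero outside the triangle with vertices $(x_0, y_0)$, $(x_1, y_1)$ and
$(x_2, y_2)$. The denominators are just $2A$, where $A$ is the area of the triangle. The function $L_0$,
illustrated in Figure 9.44(a), takes the values 1 at $(x_0, y_0)$, 0 at $(x_1, y_1)$ and $(x_2, y_2)$; it is
linear in the triangle and is taken to be zero elsewhere. The functions $L_1$ and $L_2$ behave
similarly. The field variable $u$ in the element, denoted by $u^e$, can now be written as

$$u^e(x, y) = u_0 L_0 + u_1 L_1 + u_2 L_2 \tag{9.62}$$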
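The situation is illustrated in Figure 9.44(b). Note that $u^e(x, y)$ is a linear function in $x$
and $y$, with $u^e(x_0, y_0) = u_0$, $u^e(x_1, y_1) = u_1$ and $u^e(x_2, y_2) = u_2$, and hence gives an explicit form
for the function in (9.60) that has the correct values at the nodes.


Example 9.37 Find the linear function that has the values $u_0$ at (0, 0), $u_1$ at (1, 1) and $u_2$ at $(\tfrac{1}{2}, 1)$.


Solution From (9.61), the functions $L_0$, $L_1$ and $L_2$ are given by

$$
L_0 = \begin{vmatrix} x & y & 1 \\ 1 & 1 & 1 \\ \tfrac{1}{2} & 1 & 1 \end{vmatrix} \bigg/ \begin{vmatrix} 0 & 0 & 1 \\ 1 & 1 & 1 \\ \tfrac{1}{2} & 1 & 1 \end{vmatrix} = \frac{\tfrac{1}{2}(1 - y)}{\tfrac{1}{2}} = 1 - y
$$

$$
L_1 = \begin{vmatrix} x & y & 1 \\ \tfrac{1}{2} & 1 & 1 \\ 0 & 0 & 1 \end{vmatrix} \bigg/ \begin{vmatrix} 1 & 1 & 1 \\ \tfrac{1}{2} & 1 & 1 \\ 0 & 0 & 1 \end{vmatrix} = 2x - y
$$

$$
L_2 = \begin{vmatrix} x & y & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{vmatrix} \bigg/ \begin{vmatrix} \tfrac{1}{2} & 1 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{vmatrix} = 2y - 2x
$$

Thus, from (9.62), the required linear function is

$$u = (1 - y)u_0 + (2x - y)u_1 + 2(y - x)u_2$$

or

$$u = x(2u_1 - 2u_2) + y(-u_0 - u_1 + 2u_2) + u_0$$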
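
The determinant formulas (9.61) are easy to check numerically. The following is a minimal Python sketch, not part of the book's MATLAB code; the names `det3`, `shape_functions` and `interpolate` are illustrative choices. It evaluates $L_0$, $L_1$, $L_2$ for any triangle and forms the linear interpolant (9.62).

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def shape_functions(p0, p1, p2, x, y):
    """Return (L0, L1, L2) of (9.61) at the point (x, y)."""
    rows = [(px, py, 1.0) for px, py in (p0, p1, p2)]
    point = (x, y, 1.0)
    values = []
    for k in range(3):
        # Cyclic order k, k+1, k+2; in the numerator node k is replaced by (x, y).
        nxt, last = rows[(k + 1) % 3], rows[(k + 2) % 3]
        values.append(det3([point, nxt, last]) / det3([rows[k], nxt, last]))
    return tuple(values)


def interpolate(p0, p1, p2, u, x, y):
    """Linear interpolant (9.62) with nodal values u = (u0, u1, u2)."""
    L = shape_functions(p0, p1, p2, x, y)
    return sum(ui * Li for ui, Li in zip(u, L))
```

For the triangle of Example 9.37, `shape_functions((0, 0), (1, 1), (0.5, 1), x, y)` should agree with $1 - y$, $2x - y$ and $2y - 2x$ at any point, and the three values always sum to 1.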


We build up the solution of (9.59) as the sum over all the elements of the functions
constructed to be linear in an element and zero outside the element. Thus

$$u = \sum_e u^e$$

To be of use, this function must satisfy (9.59) in some approximate sense. The
function cannot be differentiated across the element boundaries, since it has
discontinuous behaviour. We therefore have to satisfy the equation in an integrated or
‘weak’ form.
We use the well-known result that if $V$ is continuous and

$$\iint_R V\phi \, dx\, dy = 0 \tag{9.63}$$

for a complete set of functions $\phi$ (that is, a set of functions that will approximate any
continuous function as accurately as desired) then $V = 0$ in $R$. Using the residual of
(9.59) in (9.63) gives

$$
0 = \iint_R \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} - \rho \right) \phi \, dx\, dy
= -\iint_R (u_x \phi_x + u_y \phi_y + \rho\phi)\, dx\, dy - \oint_C (\phi u_y \, dx - \phi u_x \, dy)
$$

where the final integral is obtained over the boundary $C$ of $R$ using Green's theorem

$$\iint_R \left( \frac{\partial N}{\partial x} - \frac{\partial M}{\partial y} \right) dx\, dy = \oint_C (M\, dx + N\, dy)$$

described in Section 3.4.5.
Choosing $\phi$ to be zero on the boundary $C$, we have an integrated form for (9.59) as

$$0 = \iint_R (u_x \phi_x + u_y \phi_y + \rho\phi)\, dx\, dy \tag{9.64}$$

Figure 9.45 (a) A typical node and its neighbours in the region $R$; (b) the pyramid function used at the typical node.


It is this integrated or ‘weak’ form of (9.59) that we shall satisfy. We are therefore
satisfying the equation in a global sense over the whole region. In comparison,
finite-difference approximations are local to the mesh point. Clearly, we cannot use (9.64)
with a complete set of functions $\phi$, since there must be infinitely many of them. The
best that can be done is to use $N$ test functions $\phi_n$ $(n = 1, \ldots, N)$ when there are $N$
interior nodes. There will then be $N$ equations for the $N$ unknowns $u_n$ $(n = 1, \ldots, N)$ at
the node points. As $N \to \infty$, the functions $\phi_n$ must form a complete set, and then the
weak form (9.64) will be satisfied identically rather than approximately. The most
popular set of functions $\phi_n$ is that due to Galerkin, who used the pyramid functions
illustrated in Figure 9.45. At a typical point 0 with neighbours $1, 2, 3, \ldots, m$ we have

$$
\phi = \begin{cases} 1 & \text{at node } 0 \\ 0 & \text{at nodes } 1, 2, \ldots, m \end{cases} \tag{9.65}
$$

with $\phi$ piecewise-linear in each of the neighbouring triangles and identically zero
outside the neighbouring triangles.

If there are $N$ interior nodes in the mesh then there are $N$ pyramid functions of the type (9.65).
We substitute each of these functions in turn into (9.64) to satisfy the weak form of our
original Poisson equation. Taking a typical node, we see that $\phi$ is piecewise-linear in the
neighbouring triangles. For a typical such triangle 012, $\phi$ is just the linear function $L_0$
defined earlier. We substitute $\phi = L_0$ into the right-hand side of (9.64) and use the fact that

$$u = u_0 L_0 + u_1 L_1 + u_2 L_2$$

to obtain the contribution from this particular triangle as

$$
I_e = \iint_{\Delta 012} \left[ \left( u_0 \frac{\partial L_0}{\partial x} + u_1 \frac{\partial L_1}{\partial x} + u_2 \frac{\partial L_2}{\partial x} \right) \frac{\partial L_0}{\partial x}
+ \left( u_0 \frac{\partial L_0}{\partial y} + u_1 \frac{\partial L_1}{\partial y} + u_2 \frac{\partial L_2}{\partial y} \right) \frac{\partial L_0}{\partial y} + \rho_e L_0 \right] dx\, dy
$$

where $\rho_e$ is taken to be constant in the triangle. Since $L_i$ $(i = 0, 1, 2)$ are linear, $\partial L_i/\partial x$
and $\partial L_i/\partial y$ are constants, and hence the integrals can be performed explicitly, giving

$$
I_e = \frac{1}{4A} \Big\{ u_0 \big[ (y_1 - y_2)^2 + (x_1 - x_2)^2 \big] + u_1 \big[ (y_2 - y_0)(y_1 - y_2) + (x_2 - x_0)(x_1 - x_2) \big]
+ u_2 \big[ (y_0 - y_1)(y_1 - y_2) + (x_0 - x_1)(x_1 - x_2) \big] \Big\} + \tfrac{1}{3} A \rho_e
$$

From (9.64) with $\phi$ chosen as (9.65), we obtain for the point 0 the sum of such terms
over neighbouring elements:

$$\sum_e I_e = 0$$

This is just an equation of the form

$$\sum_{i=0}^{m} a_i u_i + b = 0$$

where the coefficients $a_i$ and $b$ depend only on the geometry and not on the field
variables.
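
The explicit single-triangle contribution $I_e$ can be sketched in a few lines of Python. This is a hedged illustration rather than the book's MATLAB routine; the function name `triangle_coefficients` and its return layout are assumptions made here. Vertices are assumed to be listed anticlockwise so that the area is positive.

```python
import math


def triangle_coefficients(p0, p1, p2, rho=0.0):
    """Return (a0, a1, a2, load): triangle 012 contributes
    a0*u0 + a1*u1 + a2*u2 + load to the equation at node 0."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    # Twice the signed area; positive when 0, 1, 2 are listed anticlockwise.
    two_area = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    area = 0.5 * two_area
    a0 = ((y1 - y2) ** 2 + (x1 - x2) ** 2) / (4.0 * area)
    a1 = ((y2 - y0) * (y1 - y2) + (x2 - x0) * (x1 - x2)) / (4.0 * area)
    a2 = ((y0 - y1) * (y1 - y2) + (x0 - x1) * (x1 - x2)) / (4.0 * area)
    # The integral of L0 over the triangle is area/3, with rho constant there.
    return a0, a1, a2, area * rho / 3.0


# Equilateral triangle of side 1: expect (4*u0 - 2*u1 - 2*u2) / (4*sqrt(3)).
equilateral = triangle_coefficients((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3.0) / 2.0))
```

Because a constant field has zero gradient, the three coefficients of any triangle should sum to zero, which is a convenient check on an implementation.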

A similar computation is performed for each internal node, with $\phi$ taken to be of the
form (9.65). The Poisson equation is linear in the $u_i$, and since there is one such equation
for each internal node, we obtain $N$ equations in the $N$ unknowns $u_i$ $(i = 1, \ldots, N)$. These
form a matrix (called the stiffness matrix) equation, which can be solved for the $u_i$. The
general strategy is


(i) calculate all the coefficients;
(ii) assemble the stiffness matrix;
(iii) invert the matrix to obtain the unknowns $u_i$ $(i = 1, 2, 3, \ldots, N)$;
(iv) calculate any required data from the solution.


For the Laplace equation with linear approximating functions in triangular elements,
a MATLAB implementation will be developed. In this development, note the
complexity of the programming even for this very simple situation and also the
organization of the input data required by the program. There has been no attempt at
efficiency in the programming of the sections of the code.


Example 9.38 Solve the Laplace equation $\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2 = 0$ in the unit square shown in Figure 9.46,
subject to the boundary conditions indicated.

Figure 9.46 Mesh for the finite-element solution to Example 9.38.


Solution Here we have a simple rectangular region with $\rho = 0$, and which is the same problem
as Example 9.34.

    Δ012 gives a contribution   0.625u_0 - 0.375u_1 - 0.250u_2
    Δ023   "                    0.500u_0 - 0.250u_2 - 0.250u_3
    Δ034   "                    0.625u_0 - 0.250u_3 - 0.375u_4
    Δ045   "                    0.625u_0 - 0.375u_4 - 0.250u_5
    Δ056   "                    0.500u_0 - 0.250u_5 - 0.250u_6
    Δ061   "                    0.625u_0 - 0.250u_6 - 0.375u_1

Adding all these contributions for the point 0, which is the only unspecified point, gives

$$0 = 3.5u_0 - 0.75u_1 - 0.50u_2 - 0.50u_3 - 0.75u_4 - 0.50u_5 - 0.50u_6$$

so that, knowing $u_1 = u_2 = u_3 = u_5 = u_6 = 0$ and $u_4 = 1$, we obtain $u_0 = 0.2143$. Comparing
with Example 9.34, we see that our result is not particularly accurate, which is not
surprising, since the mesh chosen here is particularly crude.
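
The bookkeeping in Example 9.38 amounts to adding the six triangle contributions node by node and then solving one linear equation. A small Python sketch of that arithmetic follows; the data layout (one dictionary per triangle) is an illustrative choice, and the boundary values assume $u_4 = 1$ with the other five neighbours zero, as read from the example.

```python
# One dictionary per triangle: node number -> coefficient of that nodal value.
contributions = [
    {0: 0.625, 1: -0.375, 2: -0.250},
    {0: 0.500, 2: -0.250, 3: -0.250},
    {0: 0.625, 3: -0.250, 4: -0.375},
    {0: 0.625, 4: -0.375, 5: -0.250},
    {0: 0.500, 5: -0.250, 6: -0.250},
    {0: 0.625, 6: -0.250, 1: -0.375},
]
# Assumed boundary data: u4 = 1, all other neighbours zero.
known = {1: 0.0, 2: 0.0, 3: 0.0, 4: 1.0, 5: 0.0, 6: 0.0}

# Assemble the single equation sum_i total[i] * u_i = 0 for node 0.
total = {}
for triangle in contributions:
    for node, coefficient in triangle.items():
        total[node] = total.get(node, 0.0) + coefficient

# Solve for the only unknown, u0.
u0 = -sum(total[n] * known[n] for n in known) / total[0]
```

With these inputs `total[0]` is 3.5 and `u0` equals 0.75/3.5, which rounds to the value 0.2143 quoted in the example.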





It is clear from Example 9.38 that the contributions from each triangle need
considerable computational effort and the finite-element method is unsuitable for hand
computations.


The coefficients in Example 9.38 can be computed from the MATLAB M-file stored
under the name coeff.m. The coordinates of the vertices of the triangle are inserted as
a=[p,q], b=[r,s] and c=[u,v]. The coefficients are produced in a0, a1, a2.


function [a0,a1,a2]=coeff(a,b,c)
% contribution a0*u0+a1*u1+a2*u2 of the triangle with vertices a, b, c
% (taken anticlockwise) to the equation at vertex a
A=[a,1]; B=[b,1]; C=[c,1];
den=0.5/det([A;B;C]);    % 1/(4*area), since det([A;B;C]) = 2*area
a0=den*((b-c)*(b-c)');
a1=den*((c-a)*(b-c)');
a2=den*((a-b)*(b-c)');


Such a function file will be used in a more general program.
Example 9.39 Solve the Laplace equation in the region shown in Figure 9.47 subject to zero boundary
conditions except for the three points indicated. All the triangles are equilateral of side $a$.

Figure 9.47 Mesh for Example 9.39. The unmarked boundary points are given as $u = 0$.
The marked boundary values are $u = 1/2$, $u = 1$ and $u = 1/2$.


Solution Note that in the region in Figure 9.47 it would be very difficult to implement a standard
finite-difference mesh. We now follow the general strategy.


(i) Calculate the coefficients. When all the triangles are equilateral, the coefficients are
all identical so the amount of computation is greatly reduced. For a typical point

    Δ012 gives a contribution   (4u_0 - 2u_1 - 2u_2)/4√3
    Δ023   "                    (4u_0 - 2u_2 - 2u_3)/4√3
    Δ034   "                    (4u_0 - 2u_3 - 2u_4)/4√3

and hence adding the six contributions gives the total for the typical point

$$(6u_0 - u_1 - u_2 - u_3 - u_4 - u_5 - u_6)/\sqrt{3}$$

(ii) Assemble the stiffness matrix. Apply the results in (i) to each of the six active points

$$
\begin{aligned}
6u_1 &= u_2 + u_3 + u_4 + 0 + \tfrac{1}{2} + 1 \\
6u_2 &= u_4 + u_5 + 0 + 0 + 0 + u_1 \\
6u_3 &= 0 + u_6 + u_4 + u_1 + \tfrac{1}{2} + 1 \\
6u_4 &= u_6 + 0 + u_5 + u_2 + u_1 + u_3 \\
6u_5 &= 0 + 0 + 0 + 0 + u_2 + u_4 \\
6u_6 &= 0 + 0 + 0 + u_3 + u_4 + 0
\end{aligned}
$$

and the matrices take the form

$$
A = \begin{bmatrix}
6 & -1 & -1 & -1 & 0 & 0 \\
-1 & 6 & 0 & -1 & -1 & 0 \\
-1 & 0 & 6 & -1 & 0 & -1 \\
-1 & -1 & -1 & 6 & -1 & -1 \\
0 & -1 & 0 & -1 & 6 & 0 \\
0 & 0 & -1 & -1 & 0 & 6
\end{bmatrix}, \quad
\mathbf{b} = \begin{bmatrix} 3/2 \\ 0 \\ 3/2 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \\ u_5 \\ u_6 \end{bmatrix}
$$

where $A$ is the stiffness matrix, $\mathbf{b}$ the load vector and $\mathbf{u}$ the vector of unknowns.

(iii) We now need to solve the matrix equation $A\mathbf{u} = \mathbf{b}$ for the vector $\mathbf{u}$. It may be
noted that the matrix does not have much ‘structure’, except that it is diagonally
dominant, so a direct inversion is usually preferred unless the dimension of the
matrix is very large. The equations were solved using MATLAB to give

$$\mathbf{u} = [\,0.3481 \quad 0.0900 \quad 0.3471 \quad 0.1514 \quad 0.0402 \quad 0.0831\,]^{\mathrm{T}}$$

(iv) Calculate any required data from the solution in (iii).
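
The system of Example 9.39 is small enough to solve with a few lines of Gaussian elimination. The sketch below is plain Python rather than the MATLAB used in the text, and the helper name `solve` is an illustrative choice; it uses partial pivoting, which is not strictly needed for a diagonally dominant matrix but costs little.

```python
def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


A = [
    [6, -1, -1, -1, 0, 0],
    [-1, 6, 0, -1, -1, 0],
    [-1, 0, 6, -1, 0, -1],
    [-1, -1, -1, 6, -1, -1],
    [0, -1, 0, -1, 6, 0],
    [0, 0, -1, -1, 0, 6],
]
b = [1.5, 0.0, 1.5, 0.0, 0.0, 0.0]
u = solve(A, b)
```

The computed `u` should reproduce the six values quoted in step (iii) to four decimal places.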


To construct the rows of the stiffness matrix, A, in a MATLAB implementation, the
coordinates of the nodes are put into a matrix

$$\mathrm{CO} = \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \\ \vdots & \vdots \\ x_n & y_n \end{bmatrix}$$

with the internal nodes first followed by the boundary nodes (note in the MATLAB
program this matrix is declared as global). The neighbours of each of the internal
nodes are placed in the matrix link, one row for each node. The MATLAB function,
stored in the M-file stiff.m, computes the contribution to the stiffness matrix from
the kth internal point; the output a gives the contribution to the kth row in the full
stiffness matrix.


function a=stiff(mm,k,L)
% mm-no of neighbours, k-current point, L-row of k's neighbours
global CO
a=zeros(1,mm+1);
for p=1:mm-1
  [l,m,n]=coeff(CO(k,:),CO(L(p),:),CO(L(p+1),:));
  % note that coeff.m is used
  a(1)=a(1)+l; a(p+1)=a(p+1)+m; a(p+2)=a(p+2)+n;
end
[l,m,n]=coeff(CO(k,:),CO(L(mm),:),CO(L(1),:));
a(1)=a(1)+l; a(mm+1)=a(mm+1)+m; a(2)=a(2)+n;

The following example illustrates the use of MATLAB in the solution of the Laplace 
equation. 
Example 9.40 Solve the Laplace equation $\nabla^2 \phi = 0$ in the elliptical region

$$\frac{x^2}{\cosh^2 1} + \frac{y^2}{\sinh^2 1} = 1$$

with $\phi = 1$ on the upper half of the ellipse and $\phi = 0$ on the lower half. The situation is
illustrated in Figure 9.48.

Figure 9.48 Mesh for the elliptical plate in Example 9.40. (Axes: x from -2 to 2, y from -1.5 to 1.5.)


Solution The problem corresponds physically to an elliptical plate, hot on one side and cold on
the other. In Figure 9.48, the triangulated mesh illustrated is the one used in the program.
Note how well the mesh fits the boundary even for a small number of boundary nodes.

Data for the problem is placed in a script file stored as inform.m. 


nin=5; nbdry=12; %number of internal and boundary nodes
global CO 
WHO Sion /HS S91 /6e ЕСО СО тала саа (Е 
EXTR RR EX EMEN HOMER ME NAE 


COS0 Og—sal (4, iL) ТТТ ОТР Т ЕВ) 
Ns КЕ 

%coords of points, internal first then bdry
ul stray ST TU UNE EO ENTE TC) TN a Sp ТО ТИЕТ NES 
io iy $6 5s 1 4 G vy S 9yg 

%links from interior points to neighbours, in CO order
bdry=[0.5 0 0 0 0 0 0.5 1 1 1 1 1]; %boundary values, in CO order
A=zeros(nin,nin); rhs=zeros(nin,1);


The solution is then computed from the following code, which should be stored in some 
convenient place so that it can be edited easily. The complete stiffness matrix, A, is 
assembled and all the boundary data is transferred to the right-hand side vector called rhs. 


inform                  %inserts the data from inform.m
for k=1:nin             %for each internal node transfers info to A or rhs
  r=link(k,:); mm=length(r);
  z=stiff(mm,k,r);
  %uses stiff.m, which in turn uses coeff.m, to compute the contributions from row k
  A(k,k)=z(1);
  for i=1:mm
    if r(i)<=nin
      A(k,r(i))=z(i+1);
    else
      rhs(k)=rhs(k)-z(i+1)*bdry(r(i)-nin);
    end
  end
end
A, rhs, A\rhs           %prints out the stiffness matrix, the rhs and the final solution


The print-out is

A =   4.3857  -1.1266  -1.1266  -1.1266  -1.1266     rhs =  -0.0604
     -1.1266   3.9085  -0.6836   0        0                  0.0699
     -1.1266  -0.6836   3.9085   0        0                  2.0284
     -1.1266   0        0        3.9085  -0.6836             2.0284
     -1.1266   0        0       -0.6836   3.9085             0.0699

which gives the final solution as

      0.5000   0.2868   0.7132   0.7132   0.2868

The solution has all the correct symmetries about the x and y axes. An exact solution to
the problem can be obtained by separation of variables in an appropriate coordinate
system in terms of Fourier series. For node 3 this method gives the value 0.7076
compared with the FE value of 0.7132; the error is less than 1%.


For the solution of the Poisson equation with $\rho \neq 0$ in (9.59) all the segments of the
MATLAB programs need to be modified. A similar problem in a rectangular region
was studied in Example 9.36 using finite differences.


Example 9.41 Solve the Poisson equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = -2$$

in the hexagonal region illustrated in Figure 9.49 and with $u = 0$ on the boundary of
the region.
Figure 9.49 Hexagonal region with mesh used in Example 9.41.

Figure 9.50 (a) Flow in hexagonal pipe; (b) cylinder in torsion, gripped at one end and gripped and twisted at the other.


Solution There are several physical interpretations of this problem. In the context of heat flow, the
boundary is kept at a fixed temperature and heat supplied at a uniform rate to the plate.
For the unidirectional flow of a viscous fluid in a long hexagonal pipe, $u$ is the velocity
and the constant right-hand side is related to the pressure gradient along the tube, see
Figure 9.50(a). A honeycomb of tubes in a heat exchanger is a possible application.

When a cylinder is in torsion, by gripping at one end and gripping and twisting at
the other, the stresses can be computed from the same Poisson equation; for an
illustration see Figure 9.50(b). For a detailed description of these physical problems and the
derivations a specialist book should be consulted (S. C. Hunter, Mechanics of
Continuous Media, Ellis Horwood, Chichester, 1983).

The modifications to the MATLAB implementations can be checked easily against
the same problem with an elliptical region since an exact solution is known to be

$$\phi = \frac{a^2 b^2}{a^2 + b^2} \left( 1 - \frac{x^2}{a^2} - \frac{y^2}{b^2} \right)$$

for the region

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$

The stiffness matrix is the same as in Example 9.40 and only the right-hand sides are modified.
For the hexagonal region with the mesh shown in Figure 9.49 the matrices are
computed using the modified MATLAB segments and the print-out is


A =   3.4641  -0.5774  -0.5774  -0.5774  -0.5774  -0.5774  -0.5774
     -0.5774   3.4641  -0.5774   0        0        0       -0.5774
     -0.5774  -0.5774   3.4641  -0.5774   0        0        0
     -0.5774   0       -0.5774   3.4641  -0.5774   0        0
     -0.5774   0        0       -0.5774   3.4641  -0.5774   0
     -0.5774   0        0        0       -0.5774   3.4641  -0.5774
     -0.5774  -0.5774   0        0        0       -0.5774   3.4641

rhs' = 0.4330   0.4330   0.4330   0.4330   0.4330   0.4330   0.4330

and gives the solution

       0.4167   0.2917   0.2917   0.2917   0.2917   0.2917   0.2917


The problem has a great deal of symmetry which is reflected in the solution and, 
indeed, symmetries could have been built into the program to reduce the computational 
effort. In the current situation the solution is computed almost instantly, but in most 
engineering problems every bit of symmetry should be used to its fullest. 
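
The print-out of Example 9.41 can be reproduced from the equilateral-triangle coefficients alone. The Python sketch below is an illustration under stated assumptions, not the modified MATLAB program: the triangle side of 1/2 is inferred from the printed right-hand side 0.4330, node 0 is taken as the centre with nodes 1 to 6 forming the ring around it, and `solve` is a generic elimination helper written for this sketch.

```python
import math


def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


side = 0.5                      # assumed triangle side (gives rhs = 0.4330)
rho = -2.0                      # right-hand side of the Poisson equation
area = math.sqrt(3.0) / 4.0 * side ** 2
diag = 6.0 / math.sqrt(3.0)     # six triangles, each contributing 4/(4*sqrt(3))
off = -1.0 / math.sqrt(3.0)     # two triangles share each edge: 2 * (-2/(4*sqrt(3)))

# Node 0 is the centre; nodes 1..6 form the ring of internal nodes around it.
neighbours = {0: list(range(1, 7))}
for k in range(1, 7):
    neighbours[k] = [0, 1 + (k - 2) % 6, 1 + k % 6]

n = 7
A = [[0.0] * n for _ in range(n)]
for k in range(n):
    A[k][k] = diag
    for j in neighbours[k]:
        A[k][j] = off
# Each internal node is surrounded by six triangles, each adding area*rho/3.
rhs = [-6.0 * area * rho / 3.0] * n
u = solve(A, rhs)
```

Under these assumptions the centre value comes out as 5/12 and the ring values as 7/24, matching the printed 0.4167 and 0.2917.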


In this section no more than the ‘flavour’ of the finite-element method has been given. 
The intimate connection with computers makes it difficult to do more than show the 
complications that occur and give an outline of how they are dealt with. 

For many problems a linear approximation is not good enough, for example in stress 
analysis for finite deformations, when high derivatives are required. Also, there is no 
good reason why a triangle is chosen rather than a quadrilateral. There are several 
choices that must be made at the start of the calculation: 


(i) division of the region into triangles, quadrilaterals, . . . ;
(ii) level of approximation, linear, quadratics, cubics, . . . ;
(iii) choice of test function;
(iv) method of integration over elements: exact, Gaussian, . . . ;
(v) method of labelling nodes;
(vi) method of solution of the resulting matrix equation.


As indicated in (iv), once we have abandoned linear approximations, the integrals 
cannot be performed exactly, and we need to use an approximate method. Gaussian 
integration for triangles works very well, and is commonly used. Usually, there is no 
obvious labelling of nodes and it is necessary for each node to keep a list of which 
nodes are neighbours, as in the matrix link in the MATLAB segment inform.m. In the 
Partial Differential Equation Toolbox in MATLAB an automatic triangulation of a region 
can be found; it automatically chooses the nodes and their coordinates, the labelling, 
as in (v), and the list of neighbours. The complexity can be appreciated for a simple 
rectangular mesh from the MATLAB commands 

g='squareg';
[p,e,t]=poimesh(g,4); % p,e,t represent the points, edges, triangles respectively
pdemesh(p,e,t)
Because the labelling is not straightforward, the resulting stiffness matrix rarely has a
simple structure, and the most usual method of inversion is by a full frontal attack
with Gaussian elimination.


The method has been illustrated for only one equation, the Poisson equation. A similar
analysis has to be undertaken for each new equation considered.


9.6.1 Exercises


All these exercises require substantial programming expertise. Alternatively the Partial Differential Equations
Toolbox in MATLAB can be used.


60 Solve the Laplace equation for the rectangular
region $0 \le x \le 6$ and $0 \le y \le 2\sqrt{3}$ using
finite elements. On the right hand boundary $u = 1$
and zero on the remainder of the boundary.

(a) Use the mesh in Figure 9.51(a) with two
interior points.

(b) Use the mesh in Figure 9.51(b) with five
interior points.

Figure 9.51 Mesh for Exercise 60: (a) with two interior points; (b) with five interior points.


61 Solve the problem in Exercise 57 using the
triangular finite-element mesh shown in
Figure 9.52.

Figure 9.52 Finite-element mesh for
Exercise 61. (Vertices (0, 0), (5, 0), (2, 3) and (0, 3).)


62 Solve the problem in Exercise 59 using the triangular
finite-element mesh shown in Figure 9.53.

Figure 9.53 Finite-element mesh for Exercise 62.


9.7 Integral solutions


In previous sections solutions were built up from elementary solutions of various partial
differential equations, the most obvious of which was from separated solutions. There
are many problems where separated solutions are not available but building up from
elementary solutions may still be possible. This section will show methods of solution
which lead to very important ideas that can be exploited practically and, importantly,
in some proofs of existence and uniqueness of solutions. Some numerical methods also
use these ideas, for instance the boundary element method is an extension of finite
elements with the advantage that the dimension of the calculations is reduced by one.
On the whole the mathematics is quite demanding so the interested reader is left to
explore the full power of the new methods in specialist books.


9.7.1 Separated solutions

In plane polar coordinates with $x = r\cos\theta$, $y = r\sin\theta$, the Laplace equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$

becomes (see Example 3.6 in Section 3.1.1)

$$\frac{1}{r}\frac{\partial}{\partial r}\left( r \frac{\partial u}{\partial r} \right) + \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2} = 0 \tag{9.66}$$

Writing $u = F(r)G(\theta)$ and substituting gives the separated equations

$$\frac{r}{F}\frac{d}{dr}\left( r \frac{dF}{dr} \right) = -\frac{1}{G}\frac{d^2 G}{d\theta^2} = \mu^2$$

or

$$r\frac{d}{dr}\left( r \frac{dF}{dr} \right) - \mu^2 F = 0 \quad \text{and} \quad \frac{d^2 G}{d\theta^2} + \mu^2 G = 0$$

For periodic solutions of the equation in $G$ the constant $\mu$ must be an integer $n$. Thus
the solution is

$$G = A\cos n\theta + B\sin n\theta$$

and for the $F$ equation it follows easily that

$$F = Cr^n + \frac{D}{r^n}$$

where $A$, $B$, $C$, $D$ are arbitrary constants. We now choose a specific problem to illustrate
the use of these solutions.


Example 9.42 Solve the Laplace equation $\nabla^2 u = 0$ inside the circle $r = a$ with $u$ specified on the
boundary as $u(a, \theta) = f(\theta)$, a continuous periodic function with period $2\pi$.


Solution If the solution is finite inside the circle then $F$ must be of the form $F = Cr^n$. A sum of
all the terms then becomes

$$u(r, \theta) = \tfrac{1}{2}A_0 + \sum_{n=1}^{\infty} r^n (A_n \cos n\theta + B_n \sin n\theta) \tag{9.67}$$


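and the solution is now a standard Fourier series problem (see Chapter 7), namely

$$u(a, \theta) = f(\theta) = \tfrac{1}{2}A_0 + \sum_{n=1}^{\infty} a^n (A_n \cos n\theta + B_n \sin n\theta)$$

The coefficients are

$$a^n A_n = \frac{1}{\pi}\int_0^{2\pi} f(t)\cos(nt)\, dt \quad \text{and} \quad a^n B_n = \frac{1}{\pi}\int_0^{2\pi} f(t)\sin(nt)\, dt$$

These coefficients are substituted into (9.67) and the summation and integration
are interchanged; this is permissible since $f(\theta)$ satisfies the Dirichlet conditions (see
Section 7.2.9)

$$u(r, \theta) = \frac{1}{\pi}\int_0^{2\pi} f(t)\left[ \tfrac{1}{2} + \sum_{n=1}^{\infty} \left( \frac{r}{a} \right)^n (\cos n\theta \cos nt + \sin n\theta \sin nt) \right] dt$$

or

$$u(r, \theta) = \frac{1}{\pi}\int_0^{2\pi} f(t)\left[ \tfrac{1}{2} + \sum_{n=1}^{\infty} \left( \frac{r}{a} \right)^n \cos n(\theta - t) \right] dt \tag{9.68}$$

Now consider the series

$$\tfrac{1}{2} + z + z^2 + z^3 + \cdots \quad \text{where } z = R\,e^{i\phi}$$

which has the sum, for $|z| < 1$,

$$\frac{1}{2} + \frac{z}{1 - z} = \frac{1 + z}{2(1 - z)}$$

Take the real part and we obtain, after a little algebra

$$\tfrac{1}{2} + R\cos\phi + R^2\cos 2\phi + \cdots = \frac{1 - R^2}{2(1 - 2R\cos\phi + R^2)}$$

Use this expression in (9.68) to obtain a final result

$$u(r, \theta) = \frac{1}{2\pi}\int_0^{2\pi} \frac{(a^2 - r^2)\, f(t)}{a^2 - 2ra\cos(\theta - t) + r^2}\, dt \tag{9.69}$$

which is called the Poisson integral formula.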
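
The Poisson integral formula (9.69) lends itself to direct numerical evaluation, because the integrand is smooth and periodic in $t$. The following Python sketch is illustrative only; the function name `poisson_integral` and the number of quadrature points are assumptions of this sketch. It uses the equally spaced rectangle rule, which is very accurate for periodic integrands as long as $r$ is not close to $a$.

```python
import math


def poisson_integral(f, a, r, theta, n=400):
    """Approximate (9.69) with the n-point rectangle rule on [0, 2*pi)."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = k * h
        kernel = (a * a - r * r) / (a * a - 2.0 * r * a * math.cos(theta - t) + r * r)
        total += kernel * f(t)
    return total * h / (2.0 * math.pi)


# Boundary data f(t) = cos t on a circle of radius 2: by (9.67) the solution
# inside is u = (r/2) cos(theta).
u_inside = poisson_integral(math.cos, 2.0, 1.0, 0.7)
```

A simple check is the boundary function $f(t) = \cos t$, for which the series (9.67) reduces to the single term $u = (r/a)\cos\theta$; at $r = 0$ the routine returns the mean of $f$ over the circle.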


Example 9.42 shows that the solution of a complicated partial differential
equation can be reduced to an integral. The problem has been reduced essentially from
a two-dimensional problem to a one-dimensional problem on the boundary. Integrals
are well understood and there is a vast array of methods that can be used to solve the
problem either explicitly or numerically. In general if a method can be developed to
convert a partial differential equation to an integral form on the boundary then a great
deal of progress has been made to obtaining a solution.

Not least of the advantages of an integral formulation is that bounds on integrals are
much easier to obtain than on differentials. These are used almost exclusively to prove
uniqueness and existence of solutions. A result that follows easily from (9.69) is
obtained by putting $r = 0$

$$u(0, \theta) = \frac{1}{2\pi}\int_0^{2\pi} f(t)\, dt \tag{9.70}$$

so that the value at the centre of the circle is the average of the values on the bounding
circle. Thus the value at the centre of the circle can never be the maximum (or minimum)
value in the region. For a general Laplace equation problem, since every interior point can
be placed at the centre of a small circle, it can never be the maximum. We can therefore
deduce that for the Laplace equation the maximum (or minimum) value cannot be in
the interior but must be on the boundary. This result is one of the keystones of the proof
of uniqueness of solution.


9.7.2 Use of singular solutions


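Again consider the two-dimensional Laplace equation (9.66) in plane polar coordinates.
If we look for solutions, $f(r)$, that only depend on $r$, then the equation becomes

$$\frac{1}{r}\frac{d}{dr}\left( r \frac{df}{dr} \right) = 0$$

which can be integrated as

$$\frac{df}{dr} = \frac{A}{r} \quad \text{and then} \quad f = A\ln r + B$$

where $A$, $B$ are arbitrary constants. The solution has a singularity at the origin which
can be exploited to obtain more general solutions and reduce the problem again to an
integral round the boundary, as in Section 9.7.1. The method is based on Green's
theorem discussed in Section 3.4.5

$$\oint_C (P\, dx + Q\, dy) = \iint_S \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\, dy \tag{9.71}$$

where the curve $C$ encloses the region $S$. Put

$$P = -u\frac{\partial v}{\partial y} + v\frac{\partial u}{\partial y} \quad \text{and} \quad Q = u\frac{\partial v}{\partial x} - v\frac{\partial u}{\partial x}$$

into (9.71) to give

$$\oint_C \left[ u\left( \frac{\partial v}{\partial x}\, dy - \frac{\partial v}{\partial y}\, dx \right) - v\left( \frac{\partial u}{\partial x}\, dy - \frac{\partial u}{\partial y}\, dx \right) \right] = \iint_S (u\nabla^2 v - v\nabla^2 u)\, dx\, dy$$
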
and finally

$$\oint_C \left( u\frac{\partial v}{\partial n} - v\frac{\partial u}{\partial n} \right) ds = \iint_S (u\nabla^2 v - v\nabla^2 u)\, dx\, dy \tag{9.72}$$

where $n$ is the normal direction and $s$ is the length parameter along $C$.

Figure 9.54 Region $S$ bounded by the curve $C$, ‘punctured’ by the small circle $C'$.

We now need to choose $u$ and $v$ in the extended Green's theorem (9.72) to obtain
useful results. The region considered is the interior of $S$, which is bounded by the curve
$C$, ‘punctured’ by a small circle $C'$ with centre $\mathbf{r}_0$ and radius $\varepsilon$ (which we will eventually
tend to zero), $\mathbf{r} = \mathbf{r}_0 + \varepsilon(\cos\phi, \sin\phi)$ (see Figure 9.54). Take

$$u = -\frac{1}{2\pi}\ln|\mathbf{r} - \mathbf{r}_0|$$

and consider the second term in (9.72) on the small circle $C'$

$$\frac{\partial u}{\partial n'} = -\frac{\partial u}{\partial \varepsilon} = \frac{1}{2\pi}\frac{\partial}{\partial \varepsilon}\ln\varepsilon = \frac{1}{2\pi}\frac{1}{\varepsilon} \quad \text{and} \quad ds = \varepsilon\, d\phi$$

Thus the term becomes

$$\oint_{C'} v\frac{\partial u}{\partial n'}\, ds = \int_0^{2\pi} v\,\frac{1}{2\pi\varepsilon}\,\varepsilon\, d\phi = \frac{1}{2\pi}\int_0^{2\pi} v\, d\phi \to v(\mathbf{r}_0)$$

The last result just comes from (9.70) and we see that this term just picks out the
value of $v$ at the point $\mathbf{r}_0$. It remains to construct appropriate $u$ and $v$ to exploit this
idea. It is left to specialist texts to consider the choices for general problems; here we
will concentrate on the Dirichlet problem (Section 9.8.2) for the Poisson equation,
where



$$\nabla^2 \psi = -\rho(x, y) \ \text{in the region } S \quad \text{and} \quad \psi \ \text{given on } C$$

and let

$$\psi' = G(x, y; x_0, y_0) = -\frac{1}{2\pi}\ln|\mathbf{r} - \mathbf{r}_0| + H(x, y; x_0, y_0) \tag{9.73}$$

so that

$$\nabla^2 \psi' = 0 \ \text{in the region } S \quad \text{and} \quad \psi' = 0 \ \text{on } C$$

and $H$ has no singularities in the region. In (9.72) put $u = \psi'$ and $v = \psi$. Because $\psi'$
satisfies the Laplace equation and $\psi$ the Poisson equation in the region $S$ we can write
(9.72) as

$$
\oint_C \left( \psi'\frac{\partial \psi}{\partial n} - \psi\frac{\partial \psi'}{\partial n} \right) ds
+ \oint_{C'} \left[ \psi'\frac{\partial \psi}{\partial n'} - \psi\frac{\partial}{\partial n'}\left( -\frac{1}{2\pi}\ln|\mathbf{r} - \mathbf{r}_0| \right) - \psi\frac{\partial H}{\partial n'} \right] \varepsilon\, d\phi
= -\iint_S \psi' \rho\, dx\, dy
$$

Taking the left-hand side terms one by one: the first term is zero since $\psi' = 0$ on $C$ from the
conditions given; the second gives the required integral round the boundary; the third
term is of order $\varepsilon\ln\varepsilon$ so tends to zero as $\varepsilon \to 0$; the fourth term was treated above and
gives $-\psi(\mathbf{r}_0)$; and the fifth term is of order $\varepsilon$ and tends to zero. Collecting up the terms
gives

$$\psi(x_0, y_0) = -\oint_C \psi(x, y)\frac{\partial}{\partial n}G(x, y; x_0, y_0)\, ds + \iint_S G(x, y; x_0, y_0)\,\rho(x, y)\, dx\, dy \tag{9.74}$$

We can now find $\psi$ at any point in the region from the value of $\psi$ on the boundary, the
right-hand side of the Poisson equation $\rho(x, y)$, together with the function $G(x, y; x_0, y_0)$,
called Green's function. At the moment it is assumed that Green's function exists and
can be calculated. For simple geometries it can often be found and advanced books
show how and when this can be done. The whole theory of Green's functions can be
applied to many different equations and boundary conditions but this is the province
of advanced books on partial differential equations (see R. Haberman, Applied
Differential Equations, Prentice Hall, Upper Saddle River, NJ, 2003). An example will
illustrate the method.


Example 9.43 Solve the Laplace equation

$$\nabla^2 f = 0 \ \text{in the region } y > 0$$

given that $f(x, 0) = F(x)$, a known function, on the x-axis and that $f$ is zero at infinity.


Solution


Green's function (9.73) can be constructed by reflection as

$$
G(x, y; x_0, y_0) = -\frac{1}{2\pi}\ln|(x - x_0,\, y - y_0)| + \frac{1}{2\pi}\ln|(x - x_0,\, y + y_0)|
= \frac{1}{4\pi}\ln\left[ \frac{(x - x_0)^2 + (y + y_0)^2}{(x - x_0)^2 + (y - y_0)^2} \right]
$$

Note that the added term has no singularities in the region $y > 0$; the function is zero
on $y = 0$ and tends to zero as $x$ and $y$ tend to infinity. Now

$$
\frac{\partial G}{\partial n} = -\frac{\partial G}{\partial y} = \frac{1}{4\pi}\left[ \frac{2(y - y_0)}{(x - x_0)^2 + (y - y_0)^2} - \frac{2(y + y_0)}{(x - x_0)^2 + (y + y_0)^2} \right]
$$

Putting $y = 0$ gives

$$\frac{\partial G}{\partial n} = -\frac{y_0}{\pi}\,\frac{1}{(x - x_0)^2 + y_0^2}$$

The solution is then obtained from (9.74) as

$$f(x_0, y_0) = \int_{-\infty}^{\infty} F(x)\,\frac{y_0}{\pi}\,\frac{1}{(x - x_0)^2 + y_0^2}\, dx$$

Again we see that the solution has been reduced to an integral along the boundary.
The exact form of Green's function is known for some classic problems, like
the one in Example 9.43, but in general finding it is a difficult calculation. The problem is
closely connected to finding a solution of the partial differential equation with zero
boundary values (at least for the Dirichlet problem) and a Dirac delta function imposed
at a point in the interior.
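
The half-plane formula of Example 9.43 can be checked numerically. In the Python sketch below (an illustration, with the function name `half_plane_solution` and the substitution chosen here rather than taken from the text), the change of variable $x = x_0 + y_0\tan\varphi$ turns the infinite integral into $\frac{1}{\pi}\int_{-\pi/2}^{\pi/2} F(x_0 + y_0\tan\varphi)\, d\varphi$, which is then approximated by the midpoint rule.

```python
import math


def half_plane_solution(F, x0, y0, n=2000):
    """Evaluate f(x0, y0) = (y0/pi) * integral of F(x)/((x-x0)^2 + y0^2) dx
    using x = x0 + y0*tan(phi) and the midpoint rule in phi."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        phi = -0.5 * math.pi + (k + 0.5) * h  # midpoints avoid tan(+-pi/2)
        total += F(x0 + y0 * math.tan(phi))
    return total * h / math.pi


def boundary(x):
    """Test boundary data F(x) = 1/(1 + x^2)."""
    return 1.0 / (1.0 + x * x)


value = half_plane_solution(boundary, 0.5, 2.0)
```

For the boundary data $F(x) = 1/(1 + x^2)$ the bounded harmonic function in $y > 0$ is $(1 + y)/(x^2 + (1 + y)^2)$, so `value` should be close to $3/9.25$.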


9.7.3 Sources and sinks for the heat conduction equation


In Example 9.4 a solution to the one-dimensional heat-conduction equation was
obtained which corresponded to a pulse of heat being supplied at a particular point at zero
time. The subsequent dissipation of the heat pulse is illustrated in Figure 9.4. A similar
solution can be obtained for the three-dimensional problem. The radially symmetric form
of the heat-conduction equation is

$$\frac{\partial T}{\partial t} = \frac{\kappa}{r^2}\frac{\partial}{\partial r}\left( r^2 \frac{\partial T}{\partial r} \right) \tag{9.75}$$

where $r$ is the radial distance from the origin and $\kappa = K/(\rho c)$ is the thermal diffusivity.
The parameter $K$ is the thermal conductivity, $\rho$ is the density of the medium and $c$ is the
specific heat capacity. The solution corresponding to the one-dimensional solution of
Example 9.4 is

$$T = \frac{Q}{8\rho c\,(\pi\kappa t)^{3/2}}\exp\left( -\frac{r^2}{4\kappa t} \right) \tag{9.76}$$

The solution can be verified by direct substitution, as in Example 9.4. It is noted that
there is a singularity at zero time which corresponds to a point source releasing an
instantaneous amount of heat $Q$, calculated as follows:

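The total amount of heat in the whole of the space at time t is computed from the amount of heat in the shell of radius r and thickness dr and then integrating:

\[ H = \int_0^\infty 4\pi r^2(\rho c T)\,dr = \int_0^\infty 4\pi r^2\,\frac{Q}{8(\pi\kappa t)^{3/2}}\exp\left(-\frac{r^2}{4\kappa t}\right)dr \]

The integration can be obtained by differentiating with respect to a the well-known integral

\[ \int_0^\infty e^{-az^2}\,dz = \frac{\sqrt{\pi}}{2\sqrt{a}} \quad\text{to get}\quad \int_0^\infty z^2 e^{-az^2}\,dz = \frac{\sqrt{\pi}}{4a^{3/2}} \]

Thus

\[ H = \frac{Q}{2\sqrt{\pi}\,(\kappa t)^{3/2}}\,\frac{\sqrt{\pi}}{4}\,(4\kappa t)^{3/2} = Q \]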

The solution (9.76) is usually called an instantaneous source, releasing an amount of heat Q at time zero. As expected, for fixed t and as r tends to infinity the temperature T tends to its resting temperature of zero. Also, for fixed r the temperature tends to zero as the time t tends to infinity and all the heat is conducted away.
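The statement that the total heat content equals Q at every time can be checked numerically. The sketch below integrates 4πr²ρcT over all shells with T taken from (9.76); the parameter values, the function names and the cut-off radius are assumptions chosen for the illustration only.

```python
import math

def instantaneous_source(r, t, Q=1.0, rho_c=1.0, kappa=1.0):
    """Temperature (9.76) at radius r and time t > 0; rho_c is the product rho*c."""
    return Q / rho_c / (8.0 * (math.pi * kappa * t) ** 1.5) * math.exp(-r * r / (4.0 * kappa * t))

def total_heat(t, Q=1.0, rho_c=1.0, kappa=1.0, n=20000):
    """Midpoint-rule estimate of H = integral of 4*pi*r^2 * rho*c*T dr over r >= 0."""
    r_max = 12.0 * math.sqrt(4.0 * kappa * t)  # the Gaussian is negligible beyond this radius
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += 4.0 * math.pi * r * r * rho_c * instantaneous_source(r, t, Q, rho_c, kappa)
    return total * h

if __name__ == "__main__":
    for t in (0.1, 1.0, 5.0):
        print(t, total_heat(t))  # each value should be close to Q = 1
```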

For a continuous source, heat is released continuously at some given rate. If, in a time interval ds at time s, the heat released is q(s) ds, then the resulting temperature is given by (9.76) with the time starting at s. The temperature due to the release of heat q(s) ds at time s is therefore

\[ \frac{q(s)\,ds}{\rho c}\,\frac{1}{8(\pi\kappa(t - s))^{3/2}}\exp\left(-\frac{r^2}{4\kappa(t - s)}\right) \]

Thus over the whole interval

\[ T(r, t) = \frac{1}{8\rho c(\pi\kappa)^{3/2}}\int_0^t \frac{q(s)}{(t - s)^{3/2}}\exp\left(-\frac{r^2}{4\kappa(t - s)}\right)ds \tag{9.77} \]


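Again we see that the solution of the heat-conduction equation can be written as an integral with all the advantages listed earlier. For most functions q(s), the integral in (9.77) cannot be performed explicitly, but for $q = q_0$, a constant, it is possible. Making the substitution

\[ u = \frac{r}{[4\kappa(t - s)]^{1/2}} \]

reduces the integral to

\[ T = \frac{q_0}{8\rho c(\pi\kappa)^{3/2}}\int_{r/(4\kappa t)^{1/2}}^{\infty}\exp(-u^2)\,\frac{2(4\kappa)^{1/2}}{r}\,du = \frac{q_0}{2\rho c\,\pi^{3/2}\kappa r}\int_{r/(4\kappa t)^{1/2}}^{\infty}\exp(-u^2)\,du = \frac{q_0}{4\pi\rho c\kappa r}\,\mathrm{erfc}\left(\frac{r}{(4\kappa t)^{1/2}}\right) \tag{9.78} \]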


where erfc is the known function defined in Example 9.24 and can be found in all 
computer packages. 
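As a hedged numerical check of the constant-source result, the sketch below compares a direct midpoint-rule evaluation of the integral (9.77) with q(s) = q0 against the closed form (9.78), using `math.erfc` from the Python standard library. The parameter values and function names are illustrative assumptions.

```python
import math

def continuous_source_quadrature(r, t, q0=1.0, rho_c=1.0, kappa=1.0, n=20000):
    """Midpoint-rule evaluation of (9.77) for a constant release rate q(s) = q0."""
    h = t / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        elapsed = t - s  # time since the heat q0*ds was released
        total += math.exp(-r * r / (4.0 * kappa * elapsed)) / elapsed ** 1.5
    return q0 * total * h / (8.0 * rho_c * (math.pi * kappa) ** 1.5)

def continuous_source_closed_form(r, t, q0=1.0, rho_c=1.0, kappa=1.0):
    """Closed form (9.78): q0 / (4 pi rho c kappa r) * erfc(r / sqrt(4 kappa t))."""
    return q0 / (4.0 * math.pi * rho_c * kappa * r) * math.erfc(r / math.sqrt(4.0 * kappa * t))

if __name__ == "__main__":
    r, t = 1.0, 2.0
    print(continuous_source_quadrature(r, t), continuous_source_closed_form(r, t))
```

For a time-dependent release rate q(s) only the quadrature route is available, with q0 moved inside the sum as q(s).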

It can be seen that for large times the erfc function tends to one, so the steady temperature due to the source decays like 1/r, and the steady temperature $T_s$ is

\[ T_s(r) = \frac{q_0}{4\pi\rho c\kappa r} \tag{9.79} \]

This function must satisfy the Laplace equation, except at the origin, and gives the three-dimensional singular solution that is used to construct Green's function in an exactly similar manner to the two-dimensional version described in Section 9.7.2.

Many situations can be tackled using (9.76) to (9.79) and the solution can be reduced to an integral, which usually requires a numerical quadrature.

Example 9.44 Find the steady temperature due to a constant line source of length 2L, placed in an infinite conducting medium with constant thermal properties.


Solution We will use the steady point solution in (9.79). The axes are set up in Figure 9.55, with (R, z) being the cylindrical polar coordinates of the field point relative to the origin at the centre of the rod; note that there is no angular coordinate since the solution is clearly symmetrical about the z axis.

Figure 9.55 (a) Line source and (b) cylindrical coordinates (R, z) for Example 9.44.

Take an element of the line at height p on the z axis, of length dp, releasing heat at a rate of $q_l\,dp$; then from (9.79) the temperature at (R, z) due to this element is

\[ \frac{q_l}{4\pi\rho c\kappa}\,\frac{1}{\sqrt{R^2 + (z - p)^2}}\,dp \]

Hence the temperature due to the whole of the line is

\[ T_s(R, z) = \frac{q_l}{4\pi\rho c\kappa}\int_{-L}^{L}\frac{1}{\sqrt{R^2 + (z - p)^2}}\,dp = \frac{q_l}{4\pi\rho c\kappa}\left[\sinh^{-1}\left(\frac{z + L}{R}\right) - \sinh^{-1}\left(\frac{z - L}{R}\right)\right] \]




Such a calculation can be used to model the temperature due to a heated pipe or cable buried underground or diffusion of contaminant from a section of a steadily leaking pipe. The effect of the burial of a line source at a distance below a surface with a fixed temperature can be calculated by adding a parallel line sink at an equal distance above the surface (see Exercise 66). Note that for large R the square bracket behaves like 2L/R to first order, so that at large distances the line source just looks like a point source with strength $2Lq_l$. A wide range of applications of these ideas can be found in H. S. Carslaw and J. C. Jaeger, Conduction of Heat in Solids, Oxford University Press, New York, 1959.
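The line-source result can be checked in two ways: by summing the point-source contributions numerically, and by confirming the far-field behaviour 2L/R of the bracket. The following sketch does both; the source strength `q_l`, the material constants and the function names are illustrative assumptions.

```python
import math

def line_source_closed_form(R, z, L, q_l=1.0, rho_c=1.0, kappa=1.0):
    """Steady temperature of a line source on -L <= z <= L (inverse-sinh form)."""
    factor = q_l / (4.0 * math.pi * rho_c * kappa)
    return factor * (math.asinh((z + L) / R) - math.asinh((z - L) / R))

def line_source_quadrature(R, z, L, q_l=1.0, rho_c=1.0, kappa=1.0, n=20000):
    """Midpoint-rule sum of steady point sources (9.79) distributed along the line."""
    h = 2.0 * L / n
    total = 0.0
    for i in range(n):
        p = -L + (i + 0.5) * h  # position of the source element on the z axis
        total += 1.0 / math.sqrt(R * R + (z - p) ** 2)
    return q_l / (4.0 * math.pi * rho_c * kappa) * total * h

if __name__ == "__main__":
    print(line_source_closed_form(0.5, 0.3, 1.0), line_source_quadrature(0.5, 0.3, 1.0))
    # Far field: bracket ~ 2L/R, i.e. a point source of strength 2*L*q_l
    R = 1000.0
    print(line_source_closed_form(R, 0.0, 1.0) * 4.0 * math.pi, 2.0 / R)
```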


9.7.4 Exercises

63 Use the Poisson formula (9.69) to solve the Laplace equation in the disk $r \le a$ with the temperature given as $T = T_0$ for $0 < \theta < \pi$ and $T = 0$ for $\pi < \theta < 2\pi$, where $(r, \theta)$ are plane polar coordinates.

64 Show that

\[ u(x, y) = \ln[(x - a)^2 + (y - b)^2] \]

satisfies the Laplace equation at all points except (a, b). Check that the function

\[ G(x, y; x_0, y_0) = \frac{1}{4\pi}\ln\left\{\frac{[(x - x_0)^2 + (y + y_0)^2]\,[(x + x_0)^2 + (y - y_0)^2]}{[(x - x_0)^2 + (y - y_0)^2]\,[(x + x_0)^2 + (y + y_0)^2]}\right\} \]

satisfies all the properties of the Green's function for the Dirichlet problem for the Laplace equation in the quarter region $x \ge 0$, $y \ge 0$. Hence solve the Laplace equation $\nabla^2 T = 0$ in the region $x \ge 0$, $y \ge 0$ with the boundary conditions T(x, 0) = f(x) for $x \ge 0$ and T(0, y) = g(y) for $y \ge 0$, where T remains bounded at infinity. Show that

\[ T(x_0, y_0) = \frac{4x_0y_0}{\pi}\left[\int_0^\infty \frac{x f(x)\,dx}{[(x - x_0)^2 + y_0^2]\,[(x + x_0)^2 + y_0^2]} + \int_0^\infty \frac{y g(y)\,dy}{[(y + y_0)^2 + x_0^2]\,[(y - y_0)^2 + x_0^2]}\right] \]

Evaluate T when f(x) = 1 and g(y) = 0.

65 Green's function of the Dirichlet problem for the Laplace equation in the disk, $r \le a$, can be written in terms of polar coordinates of the point $(r_0, \theta_0)$ and its inverse point $(a^2/r_0, \theta_0)$. Check that the function

\[ G(r, \theta; r_0, \theta_0) = \frac{1}{4\pi}\ln\left\{\frac{r_0^2\left[r^2 + \dfrac{a^4}{r_0^2} - \dfrac{2a^2 r}{r_0}\cos(\theta - \theta_0)\right]}{a^2\left[r^2 + r_0^2 - 2rr_0\cos(\theta - \theta_0)\right]}\right\} \]

satisfies the conditions of Green's function with G = 0 on $r_0 = a$. Deduce that the solution of the Laplace equation in the region r < a with $u(a, \theta) = f(\theta)$ is given by the Poisson formula (9.69).

66 Find the steady temperature T(x, y, z) due to a constant line source of length 2L, placed at x = a, y = 0, $-L \le z \le L$, with the plane x = 0 maintained at zero temperature. Use the result in Example 9.44.

67 A uniform ring source consists of instantaneous point sources at the points of the circle z = 0, $x^2 + y^2 = a^2$, or $x = a\cos\theta$, $y = a\sin\theta$. Each element of the ring, $a\,d\theta$, releases an amount of heat $qa\,d\theta$ at time t = 0. Use (9.76) to show that the temperature at any point $(R\cos\phi, R\sin\phi, z)$ is

\[ T(R\cos\phi, R\sin\phi, z) = \frac{2\pi a q}{8\rho c(\pi\kappa t)^{3/2}}\exp\left(-\frac{R^2 + z^2 + a^2}{4\kappa t}\right)I_0\left(\frac{Ra}{2\kappa t}\right) \]

where $I_0$ is a modified Bessel function, which is a known function available in MAPLE and MATLAB. It can be defined as

\[ I_0(x) = \frac{1}{2\pi}\int_0^{2\pi}\exp(x\cos\psi)\,d\psi \]
9.8 General considerations

There are properties of a general nature that can be deduced without reference to any particular partial differential equation. The formal classification of second-order equations and their intimate connection with the appropriateness of boundary conditions will be considered in this section. The much more difficult problems of existence and uniqueness are left to specialist texts.

9.8.1 Formal classification


In the preceding sections we have discussed in general terms the three classic partial dif- 
ferential equations. We shall now show that second-order linear equations can be reduced 
to one of these three types. 

Consider the general form of a second-order equation: 


\[ Au_{xx} + 2Bu_{xy} + Cu_{yy} + Du_x + Eu_y + F = 0 \tag{9.80} \]

where A, B, C, ... are constants. If we make a change of variable

\[ r = ax + y, \qquad s = x + by \]

then the chain rule gives

\[ u_{xx} = a^2u_{rr} + 2au_{rs} + u_{ss} \]
\[ u_{xy} = au_{rr} + (1 + ab)u_{rs} + bu_{ss} \]
\[ u_{yy} = u_{rr} + 2bu_{rs} + b^2u_{ss} \]

Substituting into (9.80) gives

\[ u_{rr}(Aa^2 + 2Ba + C) + 2u_{rs}(aA + B + abB + bC) + u_{ss}(A + 2Bb + b^2C) + (aD + E)u_r + (D + Eb)u_s + F = 0 \]

If we choose to eliminate the $u_{rs}$ term then we must put

\[ a(A + bB) = -(B + bC) \tag{9.81} \]

and we can eliminate a by substitution to obtain

\[ (A + 2bB + b^2C)\left[u_{ss} + \frac{AC - B^2}{(A + Bb)^2}\,u_{rr}\right] + \ldots = 0 \tag{9.82} \]

We can see immediately that the behaviour of (9.82) depends critically on the sign of $AC - B^2$ and this leads to the following classification.
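The algebra leading from (9.80) to (9.82) is easy to get wrong by hand, so a quick numerical check is useful. The sketch below evaluates the three second-order coefficients after the change of variable, with a chosen from (9.81), and confirms that the mixed term vanishes and that the ratio of the $u_{rr}$ and $u_{ss}$ coefficients is $(AC - B^2)/(A + Bb)^2$. The function names and the sample values of A, B, C and b are illustrative assumptions.

```python
def a_from_981(A, B, C, b):
    """Value of a that removes the mixed derivative, from a(A + bB) = -(B + bC)."""
    return -(B + b * C) / (A + b * B)

def transformed_coefficients(A, B, C, a, b):
    """Coefficients of u_rr, (half of) u_rs and u_ss after r = a x + y, s = x + b y."""
    coeff_rr = A * a * a + 2.0 * B * a + C
    half_coeff_rs = a * A + B + a * b * B + b * C
    coeff_ss = A + 2.0 * B * b + b * b * C
    return coeff_rr, half_coeff_rs, coeff_ss

if __name__ == "__main__":
    A, B, C, b = 2.0, 0.5, 3.0, 0.7
    a = a_from_981(A, B, C, b)
    rr, rs, ss = transformed_coefficients(A, B, C, a, b)
    print(rs)                                       # should be (numerically) zero
    print(rr / ss, (A * C - B * B) / (A + B * b) ** 2)  # the two ratios should agree
```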


Case 1: $AC - B^2 > 0$, elliptic equations

On putting $(AC - B^2)/(A + Bb)^2 = \lambda^2$, (9.82) becomes

\[ (A + 2bB + b^2C)(u_{ss} + \lambda^2u_{rr}) + \ldots = 0 \]

and on further putting $q = r/\lambda$,

\[ u_{ss} + u_{qq} + \ldots = 0 \]

The second-order terms are just the same as the Laplace operator. Equations such as (9.80) with $AC - B^2 > 0$ are called elliptic equations.


Case 2: $AC - B^2 = 0$, parabolic equations

In this case (9.82) simply becomes

\[ u_{ss} + \ldots = 0 \]

with only one second-order term surviving. The equation is almost identical to the heat-conduction equation. Equations such as (9.80) with $AC - B^2 = 0$ are called parabolic equations.

Case 3: $AC - B^2 < 0$, hyperbolic equations

On putting $(AC - B^2)/(A + Bb)^2 = -\mu^2$, (9.82) becomes

\[ (A + 2bB + b^2C)(u_{ss} - \mu^2u_{rr}) + \ldots = 0 \]

and on further putting $t = r/\mu$,

\[ u_{ss} - u_{tt} + \ldots = 0 \]

which we can identify with the terms of the wave equation. Equations such as (9.80) with $AC - B^2 < 0$ are called hyperbolic equations.

Thus we see that simply by changing axes and adjusting length scales, the general equation (9.80) is reduced to one of the three standard types. We therefore have strong reasons for studying the three classical equations very closely. An example illustrates the process.

Example 9.45 Discuss the behaviour of the equation

\[ u_{xx} + 2u_{xy} + 2\alpha u_{yy} = 0 \]

for various values of the constant $\alpha$.

Solution In the notation of (9.80), A = 1, B = 1 and $C = 2\alpha$, so from (9.81)

\[ a = -\frac{1 + 2\alpha b}{1 + b} \]

and the change of variables r = ax + y, s = x + by gives

\[ u_{ss} + \frac{2\alpha - 1}{(1 + b)^2}\,u_{rr} = 0 \]

Thus if $\alpha > \tfrac{1}{2}$ and $q = r(1 + b)/\sqrt{2\alpha - 1}$, we have the elliptic equation

\[ u_{ss} + u_{qq} = 0 \]

If $\alpha = \tfrac{1}{2}$, we have the parabolic equation

\[ u_{ss} = 0 \]

If $\alpha < \tfrac{1}{2}$ and $t = r(1 + b)/\sqrt{1 - 2\alpha}$, we have the hyperbolic equation

\[ u_{ss} - u_{tt} = 0 \]
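The classification rule is simple enough to state as a short function. The sketch below classifies the second-order part of (9.80) from the sign of $AC - B^2$ and applies it to the equation of Example 9.45 for three values of $\alpha$; the function name and the tolerance used to detect the parabolic case in floating-point arithmetic are assumptions made for the illustration.

```python
def classify(A, B, C, tol=1e-12):
    """Classify A u_xx + 2B u_xy + C u_yy + ... = 0 by the sign of AC - B^2."""
    discriminant = A * C - B * B
    if discriminant > tol:
        return "elliptic"
    if discriminant < -tol:
        return "hyperbolic"
    return "parabolic"

if __name__ == "__main__":
    # Example 9.45: A = 1, B = 1, C = 2*alpha, so AC - B^2 = 2*alpha - 1
    for alpha in (1.0, 0.5, 0.0):
        print(alpha, classify(1.0, 1.0, 2.0 * alpha))
```

Note that B here is half the coefficient of the mixed derivative, following the convention of (9.80).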
In (9.80) the assumption was that A, B, C, ... were constants. Certainly for many problems this is not the case, and A, B, ... are functions of x and y, and possibly u also. Therefore the analysis described may not hold globally for variable-coefficient equations. However, we can follow the same analysis at each point of the region under consideration. If the equation at every point is of one type, say elliptic, then we call the equation elliptic. There are good physical problems where the type can change. One of the best-known examples is transonic flow, where the equation is of the form

\[ \left(1 - \frac{u^2}{c^2}\right)\psi_{xx} - \frac{2uv}{c^2}\,\psi_{xy} + \left(1 - \frac{v^2}{c^2}\right)\psi_{yy} + f(\psi) = 0 \]

where u and v are the velocity components and c is a constant. We calculate

\[ AC - B^2 = \left(1 - \frac{u^2}{c^2}\right)\left(1 - \frac{v^2}{c^2}\right) - \frac{u^2v^2}{c^4} = 1 - \frac{u^2 + v^2}{c^2} = 1 - \frac{q^2}{c^2} \]

where $q^2 = u^2 + v^2$. If we put q/c = M, the Mach number, then for M > 1 the flow is hyperbolic and supersonic, while for M < 1 it is elliptic and subsonic. It is easy to appreciate that transonic flows are very difficult to compute, since different boundary conditions and techniques are required on the subsonic and supersonic sides.

9.8.2 Boundary conditions


In the preceding sections we chose natural boundary conditions for the three classical partial differential equations. We can formalize these ideas a bit further and look at appropriate boundary conditions and the consequences of choosing inappropriate conditions. We shall confine ourselves to two-variable situations, but it is possible to extend the theory to problems with more variables.

Suppose that we are trying to obtain the solution u(x, y) to a partial differential equation in a region R with boundary C. The commonest boundary conditions involve u or the normal derivative $\partial u/\partial n$ on C. The normal derivative (which is discussed in Section 3.2.1) at a point P on C is the rate of change of u with respect to the variable n along the line that is normal to C at P. The three conditions that are found to occur most regularly are

Cauchy conditions: u and $\partial u/\partial n$ given on C
Dirichlet conditions: u given on C
Neumann conditions: $\partial u/\partial n$ given on C


It is common for different conditions to apply to different parts of the boundary C. A 
boundary is said to be closed if conditions are specified on the whole of it, or open if 
conditions are only specified on part of it. The boundary can of course include infinity; 
conditions at infinity are specified if the boundary is closed or unspecified if it is open. 


Figure 9.56 (a) A vibrating string fixed at its ends x = 0 and x = l; (b) the corresponding region and boundary conditions in the (x, t) plane, with Dirichlet conditions on x = 0 and x = l and a Cauchy condition (u and $\partial u/\partial t$ given) on t = 0; (c) wave moving forward with time.





The natural conditions for the wave equation are Cauchy conditions on an open boundary. The d'Alembert solution (9.15) in the (x, t) plane in Section 9.3.1 corresponds to u and $\partial u/\partial t$ given on the open boundary, t = 0. Physically these conditions correspond to a given displacement and velocity at time t = 0. However, the vibrations of a finite string, say a violin string, will be given by mixed conditions (Figure 9.56a). On the initial line 0 < x < l, t = 0 (Figure 9.56b) Cauchy conditions will hold, with both u and $\partial u/\partial t$ given. The ends of the string are held fixed, so we have Dirichlet conditions (u = 0 on x = 0, $t \ge 0$ and u = 0 on x = l, $t \ge 0$).

Figure 9.56 is typical of the hyperbolic-type equations in the two variables x and t that arise in wave propagation problems. For the second-order equation

\[ Au_{xx} + 2Bu_{xy} + Cu_{yy} = 0 \]

the characteristics are defined by

\[ \frac{dy}{dx} = \frac{B \pm \sqrt{B^2 - AC}}{A} \tag{9.83} \]

For a hyperbolic equation, $B^2 - AC > 0$, so there are two characteristics, which for constant A, B and C are straight lines. Each of the characteristics carries one piece of information from the boundary into the solution region. This is illustrated in Figure 9.56(c), where the solution at P is completely determined from the information on AB. The pair of characteristics, TC and SC, then allows us to push the solution further into the region. It is clear from the d'Alembert solution that Cauchy data is required on the line t = 0 but a single condition is required on the lines x = 0, l.
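For constant coefficients the characteristic directions follow directly from (9.83). The sketch below returns the real slopes dy/dx, if any, for given A, B and C; applied to the wave equation written as $c^2u_{xx} - u_{tt} = 0$ (with t playing the role of y, an assumption about the form used for this illustration) it gives dt/dx = ±1/c, that is, the lines x ± ct = constant. The function name and the handling of the A = 0 case are also assumptions.

```python
import math

def characteristic_slopes(A, B, C):
    """Real slopes dy/dx of the characteristics of A u_xx + 2B u_xy + C u_yy = 0.

    Returns an empty tuple for an elliptic equation, one slope for a parabolic
    equation and two slopes for a hyperbolic equation, following (9.83).
    """
    discriminant = B * B - A * C
    if discriminant < 0:
        return ()
    if A == 0:
        raise ValueError("(9.83) needs A != 0; work with dx/dy instead")
    if discriminant == 0:
        return (B / A,)
    root = math.sqrt(discriminant)
    return ((B - root) / A, (B + root) / A)

if __name__ == "__main__":
    c = 2.0
    print(characteristic_slopes(c * c, 0.0, -1.0))  # wave equation: slopes -1/c and +1/c
    print(characteristic_slopes(1.0, 0.0, 1.0))     # Laplace equation: no real characteristics
```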

There is no reason why the boundaries cannot be at infinity — an extremely long string 
can sensibly be modelled in this way. Care at such infinite boundaries must be taken, since 
the modelling of what happens there is not always obvious; certainly it requires thought. 

We have mentioned the commonest boundary conditions, but it is possible to 
conceive of others. However, such conditions do not always give a unique solution; a 
physical example will illustrate this point. 

Consider the problem in Example 9.1, which has the solution

\[ u = u_0\sin\left(\frac{\pi x}{L}\right)\cos\left(\frac{\pi ct}{L}\right) \]

Suppose that a photograph of the string is taken at the times t = L/2c and t = 3L/2c. Can the solution then be constructed from these two photographs? At the two times the string has the same shape u = 0; that is, the string is in its resting position. One possible solution is therefore that the string has not moved. We know that a non-zero solution to Example 9.1 is possible, so we have two solutions to our problem, and we have lost uniqueness. Specifying the displacement at two successive times is not a sensible set of boundary conditions. Although we have stated an extreme case for clarity, the same problem is there for any t, and non-unique solutions can occur if incorrect boundary conditions are imposed.

Figure 9.57 (a) A heated bar with a temperature u = 0 at x = 0 and insulated, $\partial u/\partial x = 0$, at x = L; (b) the corresponding region and boundary conditions in the (x, t) plane, with a Dirichlet condition on x = 0, a Neumann condition on x = L and a Dirichlet condition (u = f(x) given) on t = 0; (c) solution can be computed at successive times.

The boundary conditions for the heat-conduction equation (9.7) for the one-dimensional case are given by specifying u or the normal derivative $\partial u/\partial n$ on a curve C in the (x, t) plane, that is Dirichlet or Neumann conditions respectively. Because there is only one time derivative in (9.7), we need only specify one function at t = 0 (say), rather than two as in the wave equation. In the simplest one-dimensional problem, at time t = 0 the temperature in a bar is given, u(x, 0) = f(x), and at the ends some temperature condition is satisfied for all time. Typical conditions might be u(0, t) = 0, so that the end x = 0 is kept at zero temperature, and $\partial u(L, t)/\partial x = 0$, which implies no heat loss from the end x = L. The situation is illustrated in Figure 9.57. It is clear that, no matter what the starting temperature f(x) is, the solution must tend to u = 0 as time increases.

In the case of a parabolic equation we have $B^2 - AC = 0$, so the characteristics in (9.83) coalesce. Imagine that there are two characteristics very close together. The information on the boundaries will propagate a long way, since the two lines will meet 'close to infinity'. We should therefore expect that information on the initial line would propagate forward in time, and because there is only one characteristic that one piece of information on the boundary curve would be sufficient. Figure 9.57(c) illustrates the situation, with the solution on CB being determined by a single boundary condition on each of CO, OA and AB.

Again, as with the wave equation, there is no reason why the bar cannot be of infinite 
length, at least in a mathematical idealization, so that the initial curve C can include 
infinite parts. The conditions at infinity are usually quite clear and cause little difficulty. 

An interesting feature is that it is very difficult to integrate the heat-conduction equation backwards in time. Suppose we are given a temperature distribution at time t = T and seek the initial distribution at t = 0 that produces such a distribution of temperature at t = T. If there is an exact solution then the problem can be solved, but it is unstable in the sense that small changes at t = T can lead to huge changes at t = 0. Consider, for instance, the solution to the heat-conduction equation with $\kappa = 0.5$ in the following two situations:

(a) Given u = 0 on x = 0 and 1, and at t = 5

\[ u(x, 5) = \sin(\pi x)\,e^{-25} \]

find u(x, 0). The solution is just one of the separated solutions in (9.40), namely

\[ u(x, t) = \sin(\pi x)\,e^{-5t} \]

At t = 5, $u < 2 \times 10^{-11}$; at t = 0, $u = \sin(\pi x)$.

(b) Given u = 0 on x = 0 and 1, and at t = 5

\[ u(x, 5) = \sin(2\pi x)\,e^{-100} \]

find u(x, 0). The solution is just one of the separated solutions in (9.40), namely

\[ u(x, t) = \sin(2\pi x)\,e^{-20t} \]

At t = 5, $u < 4 \times 10^{-44}$; at t = 0, $u = \sin(2\pi x)$.

The two conditions at t = 5 differ by a very small amount ($< 10^{-10}$) but, integrating backwards to t = 0, the two solutions are significantly different. Although this analysis is physically artificial it indicates why integrating backwards in time is unstable. This phenomenon, in particular, leads to almost insuperable difficulties when a numerical solution is sought, since errors are inherent in any numerical method. Such a situation applies, for instance, when a space capsule is required to have a specified temperature distribution on reaching its final orbit. The designer wants to know an initial temperature distribution that will achieve this end.
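The size of the instability can be made concrete. The sketch below assumes separated solutions of the form $\sin(n\pi x)\exp(-\kappa n^2\pi^2 t)$ on 0 < x < 1 (an assumption about the form of the separated solutions, chosen to be consistent with the decay rates quoted above after rounding) and computes the factor by which the amplitude of mode n grows when the solution is traced backwards from t = T to t = 0. The function name and parameter defaults are illustrative.

```python
import math

def backward_growth(n, kappa=0.5, T=5.0):
    """Factor by which mode sin(n pi x) is amplified in going back from t = T to t = 0."""
    return math.exp(kappa * (n * math.pi) ** 2 * T)

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, backward_growth(n))
    # A perturbation of size 1e-10 in mode 2 at t = 5 corresponds to this size at t = 0:
    print(1e-10 * backward_growth(2))
```

Because the growth factor increases like the exponential of n², any rounding error in the data at t = T, however small, contains high modes that dominate the reconstructed initial state.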

The boundary conditions most relevant to the Laplace equation are Dirichlet or Neumann conditions. These specify respectively u or the normal derivative $\partial u/\partial n$ on a closed physical boundary. One condition around the whole boundary, which may include an infinite part, is sufficient for this equation. Typically, on a rectangular plate as shown in Figure 9.58, the temperature is maintained at 1 on CD, at 0 on AB and AD, and there is no heat loss from CB.

However, it should be noted that for Neumann conditions, $\partial u/\partial n = f(s)$, on the whole boundary C, where s is a measure of length along the boundary, the function f(s) must satisfy an integral condition. Just consider the Laplace equation $\nabla^2u = 0$ in the region A with this boundary condition. In Section 3.4.5 Green's theorem was written

\[ \oint_C (P\,dx + Q\,dy) = \iint_A \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)dx\,dy \]

This can be rewritten by putting $P = -\partial u/\partial y$, $Q = \partial u/\partial x$ to give

\[ \oint_C \left(-\frac{\partial u}{\partial y}\,dx + \frac{\partial u}{\partial x}\,dy\right) = \iint_A \nabla^2u\,dx\,dy = 0 \]

The right-hand side is put equal to zero since u satisfies the Laplace equation. Thus

\[ 0 = \oint_C \left(\frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}\right)\cdot(dy, -dx) = \oint_C \frac{\partial u}{\partial n}\,ds \]

and therefore

\[ \oint_C f(s)\,ds = 0 \]
Figure 9.58 Typical boundary conditions for the Laplace equation in a rectangular plate.

Figure 9.59 Appropriateness of boundary conditions to the three classical partial differential equations (adapted from P. M. Morse and H. Feshbach, Methods of Theoretical Physics, Volume I. McGraw-Hill, New York, 1953).

For the steady heat-conduction interpretation, $\partial u/\partial n$ is proportional to the heat entering through an element of the boundary. This result says that for a steady state to be achieved the net amount of heat entering the region must be zero.
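The compatibility condition on Neumann data can be illustrated numerically. The sketch below sums the outward normal derivative around the boundary of the unit square for a function whose gradient is supplied by the caller. For the harmonic function u = x² − y² the net flux vanishes, while for u = x² + y² (which is not harmonic, since ∇²u = 4) it does not. The choice of region, the test functions and the function name are assumptions made for this illustration.

```python
def net_boundary_flux(grad_u, n=1000):
    """Midpoint-rule integral of du/dn around the unit square 0 <= x, y <= 1.

    grad_u(x, y) must return the pair (du/dx, du/dy); n is the number of
    quadrature points on each side of the square.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += -grad_u(s, 0.0)[1]  # bottom side, outward normal (0, -1)
        total += grad_u(1.0, s)[0]   # right side, outward normal (1, 0)
        total += grad_u(s, 1.0)[1]   # top side, outward normal (0, 1)
        total += -grad_u(0.0, s)[0]  # left side, outward normal (-1, 0)
    return total * h

if __name__ == "__main__":
    harmonic = lambda x, y: (2.0 * x, -2.0 * y)     # gradient of x^2 - y^2
    not_harmonic = lambda x, y: (2.0 * x, 2.0 * y)  # gradient of x^2 + y^2
    print(net_boundary_flux(harmonic))      # close to 0
    print(net_boundary_flux(not_harmonic))  # close to 4, the integral of the Laplacian
```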

Figure 9.58 is typical of an elliptic equation where we have conditions on a closed boundary. In (9.83) we have the condition that $B^2 - AC < 0$, so the characteristics associated with the solution do not make sense in the real plane. There is no 'time' in elliptic problems; such problems are concerned with steady-state behaviour and not propagation with time. We are dealing with a fundamentally different situation from the hyperbolic and parabolic cases. The interpretation of the idea of characteristics is unclear physically, and does not prove to be a useful direction to explore, although advanced theoretical treatments do use the concept.


[Figure 9.58 labels: Dirichlet conditions on three sides of the plate and a Neumann condition on the fourth; u = 0 on AB.]





It is possible to solve the Laplace equation with other boundary conditions, for instance Cauchy conditions u and $\partial u/\partial x$ on the y axis. However, an example due to Hadamard (see Exercise 49) shows that the solution is unstable in the sense that small changes in the boundary conditions cause large changes in the solution. This type of problem is not well posed, and should not occur in a physical situation; however, mistakes are made and this type of behaviour should be carefully noted.

Figure 9.59 gives in tabular form a summary of the appropriate boundary conditions for these problems.

Data          Boundary  ∇²u = u_tt         ∇²u = 0                  ∇²u = u_t
                        (hyperbolic)       (elliptic)               (parabolic)

Dirichlet or  Open      Insufficient data  Insufficient data        Unique, stable
Neumann                                                             solution for t > 0

              Closed    Not unique         Unique, stable (to an    Overspecified
                                           arbitrary constant in
                                           the Neumann case)

Cauchy        Open      Unique, stable     Solution may exist,      Overspecified
                                           but is unstable

              Closed    Overspecified      Overspecified            Overspecified
9.8.3 Exercises

68 Determine the type of each of the following partial differential equations, and reduce them to the standard form by change of axes:

(a) $u_{xx} + 2u_{xy} + u_{yy} = 0$
(b) $u_{xx} + 2u_{xy} + 5u_{yy} + 3u_x + u = 0$
(c) $3u_{xx} - 5u_{xy} - 2u_{yy} = 0$

69 Find the general solution of the equation in Exercise 68(c).

70 Use the change of variable u = x + y, v = x − y to transform the partial differential equation

\[ \frac{\partial^2 f}{\partial x^2} - 2\frac{\partial^2 f}{\partial x\,\partial y} + \frac{\partial^2 f}{\partial y^2} = 0 \tag{9.84} \]

to

\[ \frac{\partial^2 f}{\partial v^2} = 0 \]

Hence compute the general solution of equation (9.84) as

\[ f = (x - y)F(x + y) + G(x + y) \]

where F and G are arbitrary functions.

71 Establish the nature of the Tricomi equation

\[ yu_{xx} + u_{yy} = 0 \]

in the regions (a) y > 0, (b) y = 0 and (c) y < 0. Use (9.83) to determine the characteristics of the equation where they are real.

72 Verify that the function

\[ f = [Ax^3 + (B/x^2)]\,y(1 - y^2) \]

with A and B constants, satisfies the partial differential equation

\[ x^2\frac{\partial^2 f}{\partial x^2} + (1 - y^2)\frac{\partial^2 f}{\partial y^2} = 0 \]

In which regions is the equation elliptic, parabolic and hyperbolic?

73 Determine the nature of the equation

\[ 2q\frac{\partial^2 u}{\partial p^2} + 4p\frac{\partial^2 u}{\partial p\,\partial q} + 2q\frac{\partial^2 u}{\partial q^2} + 2\frac{\partial u}{\partial q} = 0 \]

Show that if $p = \tfrac{1}{2}(x^2 - y^2)$ and $q = \tfrac{1}{2}(x^2 + y^2)$, the equation reduces to the Laplace equation in x and y.
74 Show that the equation


0 


2 
ди _ 
2 


дх 


ду? 


is hyperbolic. Sketch the domain of 
dependence and range of influence from 
the characteristics. 


9.9 Engineering application: wave propagation under a moving load


A wide range of practical problems can be studied under the general heading of moving 
loads. Cable cars that carry passengers, buckets that remove spoil to waste tips, and 
cable cranes are very obvious examples, while electric train pantographs on overhead 
wires are perhaps less obvious. Extending the problem to beams opens up a whole 
range of new problems, such as trains going over bridges, gantry cranes and the like. 
An excellent general discussion and wide range of applications is given by L. Fryba, 
Vibration of Solids and Structures under Moving Loads (Noordhoff, Groningen, 1973), 
and Initial Value Problems, Fourier Series, Overhead Wires, Partial Differential Equations 
of Applied Mathematics, Open University Mathematics Unit M321, 5, 6 and 7 (Milton 
Keynes, 1974) treat pantographs on overhead wires. 
Figure 9.60 Moving load across a taut wire.





A straightforward linearized theory of a cable tightly stretched between fixed 
supports provides important information about the behaviour of such systems. Certainly, 
if such a theory could not be solved then more complicated problems involving 
large deformations, slack cables or beams would be beyond reach. The basic assump- 
tions are 


(a) the deflections from the horizontal are small compared with the length of the cable; 
(b) deflections due to the weight of the cable itself are neglected; 


(c) the horizontal tension in the cable is so large compared with the perturbations 
caused by the load that it may be regarded as constant. 


Figure 9.60 shows the situation under study and the coordinate system used. Because the problem is one of small deflections, the basic equation is the wave equation with a forcing term from the moving load:

\[ -c^2\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial t^2} = p(x, t) \]

Since the two ends are fixed, we have

\[ z(0, t) = z(l, t) = 0 \quad (t \ge 0) \tag{9.85} \]

A trolley is assumed to start at x = 0, with the cable initially at rest; that is,

\[ z(x, 0) = \frac{\partial z}{\partial t}(x, 0) = 0 \quad (0 \le x \le l) \tag{9.86} \]

It remains to specify the forcing function p(x, t) due to the moving load. We use the simplest assumption of the delta function and step function, as defined in Section 5.5, namely

\[ -c^2\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial t^2} = P\,\delta\left(t - \frac{x}{v}\right)H(l - x) \tag{9.87} \]

The delta function models the impulse of the trolley at time t at distance x, while the step function switches off the forcing function when the trolley reaches the end x = l.

There are several ways of solving this equation, but here the Laplace transform method will be used. Taking the transform of (9.87) using (9.30) and (9.32) together with the initial condition (9.86), we obtain the ordinary differential equation

\[ -c^2Z'' + s^2Z = P\,e^{-sx/v}H(l - x) \]

Since we have no interest in the case x > l, the final factor can be omitted, since it is just 1 if x < l and 0 if x > l. It is now straightforward to solve this equation as

\[ Z = A\,e^{sx/c} + B\,e^{-sx/c} + \frac{P\,e^{-sx/v}}{s^2(1 - c^2/v^2)} \]

Before evaluating A and B, it is clear that the speed v = c causes problems, since the third term is then infinite, and the solution is not valid for this case. The solution is going to depend on whether the trolley speed is subcritical, v < c, or supercritical, v > c.

Equation (9.85) gives Z(0, s) = Z(l, s) = 0 for the boundary conditions, so that A and B can now be evaluated from

\[ 0 = A + B + \frac{P}{s^2(1 - c^2/v^2)} \]

\[ 0 = A\,e^{sl/c} + B\,e^{-sl/c} + \frac{P\,e^{-sl/v}}{s^2(1 - c^2/v^2)} \]

Some straightforward algebra gives A and B, and hence Z, as

\[ Z = \frac{P}{s^2(1 - c^2/v^2)}\left[e^{-sx/v} - e^{-sl/v}\,\frac{\sinh(sx/c)}{\sinh(sl/c)} + \frac{\sinh[s(x - l)/c]}{\sinh(sl/c)}\right] \]

It is easy to check that the two boundary conditions at x = 0 and x = l are satisfied. As with all transform solutions, the main question is whether the inversion can be performed. Fortunately the three terms can be found in tables of transforms, to give

\[ \frac{z(x, t)(1 - c^2/v^2)}{P} = \left(t - \frac{x}{v}\right)H\left(t - \frac{x}{v}\right) - H\left(t - \frac{l}{v}\right)\left[\frac{x}{l}\left(t - \frac{l}{v}\right) + \frac{2l}{c\pi^2}\sum_{n=1}^{\infty}\frac{(-1)^n}{n^2}\sin\left(\frac{n\pi x}{l}\right)\sin\left(\frac{n\pi c}{l}\left(t - \frac{l}{v}\right)\right)\right] + \left[\frac{(x - l)t}{l} + \frac{2l}{c\pi^2}\sum_{n=1}^{\infty}\frac{(-1)^n}{n^2}\sin\left(\frac{n\pi(x - l)}{l}\right)\sin\left(\frac{n\pi ct}{l}\right)\right] \tag{9.88} \]


The three terms can be identified immediately. The first is the displacement caused by the trolley moving with speed v and hitting the value x after a time t = x/v; the second term only appears for t > l/v, and gives the reflected wave from x = l; while the third term is the wave caused by the trolley disturbance propagating in the cable with wave speed c.

To look a little more closely at the solution (9.88), we shall consider the case $x = \tfrac{1}{2}l$. Thus the motion of the midpoint will be considered as a function of time. Plotting such waves is easier in non-dimensional form, so we first rewrite (9.88) in terms of

\[ D = \left(1 - \frac{c^2}{v^2}\right)\frac{zc}{Pl}, \qquad \tau = \frac{ct}{l}, \qquad \lambda = \frac{c}{v} \]

so that D is the non-dimensional displacement, $\tau$ is the non-dimensional time, with $\tau = 1$ corresponding to the time for the wave to propagate the length of the cable, and $\lambda$ is the ratio of wave speed to trolley speed. The second step is then to take $x = \tfrac{1}{2}l$ to give

\[ D = (\tau - \tfrac{1}{2}\lambda)H(\tau - \tfrac{1}{2}\lambda) - H(\tau - \lambda)\left\{\tfrac{1}{2}(\tau - \lambda) - \frac{2}{\pi^2}\left[\sin\pi(\tau - \lambda) - \tfrac{1}{9}\sin3\pi(\tau - \lambda) + \tfrac{1}{25}\sin5\pi(\tau - \lambda) - \ldots\right]\right\} - \tfrac{1}{2}\tau + \frac{2}{\pi^2}\left(\sin\pi\tau - \tfrac{1}{9}\sin3\pi\tau + \tfrac{1}{25}\sin5\pi\tau - \ldots\right) \]


In Figure 9.61 the supercritical case, $\lambda = 0.3$, is displayed. It may be noted that the three terms 'switch on' at times $\tau = 0.15$, $\tau = 0.8$ and $\tau = 0.5$ respectively, corresponding physically to the trolley hitting, the reflected wave arriving and the initial wave arriving. The motion is subsequently periodic, as indicated in the figure.
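The midpoint displacement can be evaluated directly by truncating the sine series. The sketch below is a minimal illustration, assuming the non-dimensional form D = (τ − λ/2)H(τ − λ/2) − H(τ − λ)[(τ − λ)/2 − S(τ − λ)] − τ/2 + S(τ), where S(y) = (2/π²)[sin πy − (1/9) sin 3πy + (1/25) sin 5πy − ...]; the function names and the number of terms retained are arbitrary choices. Since S(y) equals y/2 for |y| ≤ 1/2, the second and third terms remain zero until their arguments reach 1/2, which is the origin of the switch-on times 0.8 and 0.5 for λ = 0.3.

```python
import math

def sine_series(y, terms=2000):
    """S(y) = (2/pi^2) * [sin(pi y) - sin(3 pi y)/9 + sin(5 pi y)/25 - ...], truncated."""
    total = 0.0
    for k in range(terms):
        m = 2 * k + 1
        total += (-1) ** k * math.sin(m * math.pi * y) / (m * m)
    return 2.0 / math.pi ** 2 * total

def midpoint_displacement(tau, lam, terms=2000):
    """Non-dimensional displacement D of the cable midpoint at non-dimensional time tau."""
    def step(x):
        return 1.0 if x > 0.0 else 0.0
    first = (tau - 0.5 * lam) * step(tau - 0.5 * lam)          # trolley passing the midpoint
    second = -step(tau - lam) * (0.5 * (tau - lam) - sine_series(tau - lam, terms))
    third = -0.5 * tau + sine_series(tau, terms)               # wave travelling along the cable
    return first + second + third

if __name__ == "__main__":
    lam = 0.3  # supercritical case
    for tau in (0.1, 0.4, 0.6, 1.0, 2.0):
        print(tau, midpoint_displacement(tau, lam))
```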
Figure 9.61 Solution of the moving-load problem for $x = \tfrac{1}{2}l$ and $\lambda = 0.3$; the supercritical case.

Figure 9.62 Solution of the moving-load problem for $x = \tfrac{1}{2}l$ and $\lambda = 3$; the subcritical case.





Similarly, the subcritical case $\lambda = 3$ is shown in Figure 9.62. Here the switches are at $\tau = 1.5$, 3.5 and 0.5 respectively for the three terms, with the same interpretation as above. Because of the very odd choice of parameter $\lambda$, only one pulse is seen at the centre point, with the terms subsequently cancelling.







While the model illustrates many of the obvious properties of wave propagation, it 
clearly has its limitations. The discontinuous behaviour in the gradient of the displacement 
looks unrealistic, and the absence of damping means that oscillations once started con- 
tinue for ever. It is clear that more subtle modelling of the phenomenon is required to 
make the solutions realistic, but the general behaviour of the solution would still be followed. 


9.10 Engineering application: blood-flow model


A problem of considerable interest is how to deal with the flow of a fluid through a tube with distensible walls and hence variable cross-section. An obvious application is to the flow of blood in a blood vessel. The full Navier-Stokes equations for viscous flow are difficult to solve and the distensible wall, where boundary conditions are not clear, makes for an impossible problem. An alternative, simpler and more heuristic approach is possible and some useful solutions can be deduced. The work is based on a paper by A. Singer (Bulletin of Mathematical Biophysics 31 (1969) 453-70), where more details can be found about the practical application to blood flows.

Figure 9.63 An element of the flexible tube in the blood-flow problem.

The assumptions required to set up the model are as follows: 


(1) the flow is one-dimensional; 

(2) the flow is incompressible and laminar; 

(3) the flow is slow, so that all quadratic terms can be neglected; 

(4) the resistance to flow is assumed to be proportional to the velocity; 

(5) there is a leakage through the walls that is proportional to the pressure; 
(6) the cross-sectional area S is a function of the pressure only. 


We take the situation illustrated in Figure 9.63, and we denote the pressure by p, the
velocity by v, the time by t and the axial distance along the tube by x. The first equation
that we derive is a continuity equation, which states that in a time Δt the fluid that comes
into the element must leave the element:

\[
\underbrace{S_{t+\Delta t}\,\Delta x}_{\text{volume after}}
- \underbrace{S_{t}\,\Delta x}_{\text{volume before}}
= -\underbrace{(vS)_{x+\Delta x}\,\Delta t}_{\text{volume out of right-hand end}}
+ \underbrace{(vS)_{x}\,\Delta t}_{\text{volume into left-hand end}}
- \underbrace{gpS\,\Delta x\,\Delta t}_{\text{leakage}}
\]

The proportionality constant g is the leakage per unit volume of tube per unit time. The
equation can be rewritten as

\[
\frac{S_{t+\Delta t} - S_{t}}{\Delta t} + \frac{(vS)_{x+\Delta x} - (vS)_{x}}{\Delta x} + gpS = 0
\]

or

\[
\frac{\partial S}{\partial t} + \frac{\partial}{\partial x}(vS) + gpS = 0 \tag{9.89}
\]
A second equation is required to evaluate v, and this comes from Newton’s law that the
force is proportional to the rate of change of momentum. The force in the x direction
acting on the element in Figure 9.63 is

\[
\text{force} = \underbrace{(pS)_{x}}_{\text{pressure force on left-hand end}}
- \underbrace{(pS)_{x+\Delta x}}_{\text{pressure force on right-hand end}}
- \underbrace{vrS\,\Delta x}_{\text{resistance}}
\]

where r is the resistance per unit length per unit cross-section per unit time, and is the
proportionality constant in assumption (4). The change in the momentum in time Δt is
more difficult to compute because of the convection due to the moving fluid. However,
these effects only involve second-order terms, and hence can be omitted by assumption
(3). The calculation is straightforward under this assumption, so that

\[
\Delta M = \text{change in momentum}
= \underbrace{\rho\,(v_{t+\Delta t})\,S\,\Delta x}_{\text{momentum after}}
- \underbrace{\rho\,(v_{t})\,S\,\Delta x}_{\text{momentum before}}
\]

where ρ is the density of the fluid. Thus

\[
\frac{\partial M}{\partial t} = \rho S\,\Delta x\,\frac{\partial v}{\partial t}
\]

Putting the force equal to the rate of change of momentum, we obtain, on taking the
limit as Δx → 0,

\[
\rho S\frac{\partial v}{\partial t} = -\frac{\partial (pS)}{\partial x} - rvS \tag{9.90}
\]


Now assumption (6) gives S = S(p), so that

\[
\frac{1}{S}\frac{\partial S}{\partial x} = \frac{1}{S}\frac{dS}{dp}\frac{\partial p}{\partial x},
\qquad
\frac{1}{S}\frac{\partial S}{\partial t} = \frac{1}{S}\frac{dS}{dp}\frac{\partial p}{\partial t}
\]

We define c = (1/S) dS/dp as the distensibility of the tube, that is the change in S per
unit area per unit change in p. Equations (9.89) and (9.90) become

\[
c p_t + v_x + c p_x v + g p = 0
\]
\[
\rho v_t + p_x + c p p_x + r v = 0
\]

and since the terms v p_x and p p_x can be neglected by assumption (3), we arrive at our
final equations

\[
\left.
\begin{aligned}
c p_t + v_x + g p &= 0 \\
\rho v_t + p_x + r v &= 0
\end{aligned}
\right\} \tag{9.91}
\]

These are the linearized flow equations, and are identical with the transmission line
equations describing the flow of electricity down a long, leaky wire such as a
transatlantic cable (see Exercise 10).

We can now look at special cases that will prove to be very informative about the 
various terms in the equation. 


Case (i): c = constant, r = g = 0

This case corresponds to constant distensibility, which in turn gives S = A e^{cp}, since S
must satisfy c = (1/S) dS/dp. Thus we have made a specific assumption about how S
depends on p. The r = g = 0 implies the absence of resistance and leakage. Eliminating
p between the two equations in (9.91) gives

\[
v_{xx} = (c\rho)\,v_{tt}
\]

which is just the wave equation. We know that any pulse will propagate perfectly with
a velocity u = 1/√(cρ). The assumption in the problem is that the tube is one-dimensional
and has no branches. Clearly a heart pulse will propagate to the nearest branch, but
there will then be reflection and a complicated behaviour near the branch. In long
arteries like the femoral artery the theory can be checked for its validity.
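
The elimination of p in Case (i) is stated without intermediate steps. A short worked version, using only the linearized equations (9.91) with r = g = 0 and assuming p is smooth enough for the mixed derivatives to commute, might read:

```latex
\begin{align*}
c\,p_t + v_x &= 0, & \rho\,v_t + p_x &= 0
  && \text{(9.91) with } r = g = 0 \\
c\,p_{xt} + v_{xx} &= 0, & \rho\,v_{tt} + p_{xt} &= 0
  && \text{differentiate w.r.t. } x \text{ and } t \text{ respectively} \\
v_{xx} &= -c\,p_{xt} = c\rho\,v_{tt}
  && \text{substitute } p_{xt} = -\rho\,v_{tt} \\
v_{tt} &= u^2 v_{xx}, & u &= \frac{1}{\sqrt{c\rho}}
  && \text{standard wave-equation form}
\end{align*}
```

The same steps applied with p and v interchanged show that p satisfies the same wave equation, so pressure and velocity pulses travel at the same speed.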


Case (ii): S = constant

Here we are considering a rigid tube where the cross-sectional area does not vary, and
hence c = 0. Eliminating p between the two equations in (9.91) gives

\[
\rho v_t - \frac{1}{g}\,v_{xx} + rv = 0
\]

Substituting v = V e^{-rt/ρ}, we have

\[
\rho V_t - \frac{1}{g}\,V_{xx} = 0
\]

so that

\[
V_t = \frac{1}{\rho g}\,V_{xx}
\]

which is just the diffusion equation. The solution for this rigid-tube case is therefore
a damped, diffusion solution. Typically, if we start with a delta-function pulse at the
origin then it can be checked that the solution is

\[
v = A\,\frac{e^{-rt/\rho}\,e^{-\rho g x^2/4t}}{t^{1/2}}
\]

where A is a constant. This solution is plotted in Figure 9.64, and shows the rapid
damping. Such a pulse would be most unlikely to propagate far enough for blood to reach the
whole of the system.

Figure 9.64 Development of the solution to the blood-flow problem from a delta function
for successive times t_1, t_2 and t_3.
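
The claim that the delta-function solution "can be checked" lends itself to a quick numerical test. The sketch below is only illustrative: the parameter values ρ = 1, g = 2, r = 0.5 and the sample points are arbitrary assumptions, not data from the text. It evaluates the rigid-tube equation ρv_t − v_xx/g + rv = 0 by central differences for v = A e^{−rt/ρ} e^{−ρgx²/4t} / t^{1/2}.

```python
import math


def v(x, t, rho, g, r, A=1.0):
    """Candidate rigid-tube solution A*exp(-r t/rho)*exp(-rho g x^2/(4 t))/sqrt(t)."""
    return A * math.exp(-r * t / rho) * math.exp(-rho * g * x * x / (4.0 * t)) / math.sqrt(t)


def residual(x, t, rho, g, r, h=1e-3):
    """Central-difference estimate of rho*v_t - v_xx/g + r*v at the point (x, t)."""
    v_t = (v(x, t + h, rho, g, r) - v(x, t - h, rho, g, r)) / (2.0 * h)
    v_xx = (v(x + h, t, rho, g, r) - 2.0 * v(x, t, rho, g, r)
            + v(x - h, t, rho, g, r)) / (h * h)
    return rho * v_t - v_xx / g + r * v(x, t, rho, g, r)


# Illustrative parameter values only (assumed, not taken from the text).
rho, g, r = 1.0, 2.0, 0.5
for x, t in [(0.0, 1.0), (0.3, 1.0), (1.0, 2.0)]:
    print(f"x={x}, t={t}: v={v(x, t, rho, g, r):.6f}, "
          f"residual={residual(x, t, rho, g, r):.2e}")
```

The residuals should be at the level of the finite-difference truncation error, and the printed values of v at x = 0 fall as t grows, which is the rapid damping referred to above.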


The two cases considered are extremes, but, just from the analysis performed, some
conclusions can be drawn. If there is no distensibility then pulses will not propagate but
will just diffuse through the system. We conclude that to move blood through the system
with a series of pulses is not possible with rigid blood vessels, and we need flexible
walls. Certainly for older people with hardening of the arteries, a major problem is to
pump blood round the whole system, and this fact is confirmed by the mathematics.

The actual situation is somewhere between the two cases cited, but there are no
simple solutions for such cases except for the ‘balanced line’ case when cr = gρ (see
Exercise 10(c) and Review exercise 20). Singer solves the equations numerically for
data appropriate to a dog aorta, and compares his results with experiment. Although the
agreement is good, there are problems, since there appears to be a residual pressure
after each pulse. The overall pressure would therefore build up to levels that are clearly
not acceptable.

9.11 Review exercises (1-21)


1 A uniform string is stretched along the x axis and its ends fixed at the points x = 0
and x = a. The string at the point x = b (0 < b < a) is drawn aside through a small
displacement ε perpendicular to the x axis, and released from rest at time t = 0. By
solving the one-dimensional wave equation, show that at any subsequent time t the
transverse displacement y is given by

\[
y = \frac{2\varepsilon a^2}{\pi^2 b(a-b)} \sum_{n=1}^{\infty} \frac{1}{n^2}
\sin\left(\frac{n\pi b}{a}\right) \sin\left(\frac{n\pi x}{a}\right) \cos\left(\frac{n\pi ct}{a}\right)
\]

where c is the transverse wave velocity in the string.


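2 The function φ(x, t) satisfies the wave equation

\[
\frac{\partial^2 \phi}{\partial x^2} = \frac{\partial^2 \phi}{\partial t^2} \qquad (t > 0,\ 0 < x < l)
\]

and the conditions

φ(x, 0) = x² (0 ≤ x ≤ l)
∂φ/∂t (x, 0) = 0 (0 ≤ x ≤ l)
φ(0, t) = 0 (t ≥ 0)
∂φ/∂x (l, t) = 2l (t ≥ 0)

Show that the Laplace transform of the solution is

\[
\frac{x^2}{s} + \frac{2}{s^3} - \frac{2\cosh s(x-l)}{s^3 \cosh sl}
\]

Using tables of Laplace transforms, deduce that the solution of the wave equation is

\[
\phi(x, t) = 2lx - \frac{32 l^2}{\pi^3} \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)^3}
\cos\left[\frac{(2n+1)\pi (x-l)}{2l}\right] \cos\left[\frac{(2n+1)\pi t}{2l}\right]
\]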


3 The damped vibrations of a stretched string are governed by the equation

\[
\frac{\partial^2 y}{\partial t^2} + \frac{1}{\tau}\frac{\partial y}{\partial t} = c^2 \frac{\partial^2 y}{\partial x^2} \tag{9.92}
\]

where y(x, t) is the transverse deflection, t is the time, x is the position coordinate
along the string, and c and τ are positive constants. A taut elastic string, 0 ≤ x ≤ l,
is fixed at its end points so that y(0, t) = y(l, t) = 0. Show that separation of variable
solutions of (9.92) satisfying these boundary conditions are of the form

\[
y_n(x, t) = T_n(t) \sin\left(\frac{n\pi x}{l}\right) \qquad (n = 1, 2, \ldots)
\]

where

\[
\frac{d^2 T_n}{dt^2} + \frac{1}{\tau}\frac{dT_n}{dt} + \left(\frac{n\pi c}{l}\right)^2 T_n = 0
\]

Show that if the parameters c, τ and l are such that 2πcτ > l, the solutions for T_n are
all of the form

\[
T_n(t) = e^{-t/2\tau}\,(a_n \cos \omega_n t + b_n \sin \omega_n t)
\]

where

\[
\omega_n = \frac{n\pi c}{l}\left(1 - \frac{l^2}{4\pi^2 n^2 c^2 \tau^2}\right)^{1/2}
\]

and a_n and b_n are constants.

Hence find the general solution of (9.92) satisfying the given boundary conditions.

Given the initial conditions y(x, 0) = 4 sin(3πx/l) and (∂y/∂t)_{t=0} = 0, find y(x, t).


4 A thin uniform beam OA of length l is clamped horizontally at both ends. For small
transverse vibrations of the beam the displacement u(x, t) at time t at a distance x
from O satisfies the equation

\[
\frac{\partial^4 u}{\partial x^4} = -\frac{1}{a^2}\frac{\partial^2 u}{\partial t^2}
\]

where a is a constant. The restriction that the beam is clamped horizontally gives the
boundary conditions

u = 0, ∂u/∂x = 0 (x = 0, l)

Show that for periodic solutions of the type

u(x, t) = V(x) sin(ωt + ε)

where ω and ε are constants, to exist, V must satisfy an equation of the form

\[
\frac{d^4 V}{dx^4} = \alpha^4 V \tag{9.93}
\]

where α⁴ = (ω/a)², and the boundary conditions

V(0) = V′(0) = V(l) = V′(l) = 0

Verify that

V = A cosh αx + B cos αx + C sinh αx + D sin αx

where A, B, C and D are constants, satisfies (9.93), and show that this function
satisfies the boundary conditions provided that

B = −A, D = −C

and α is a root of

cos αl cosh αl = 1


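5 In a uniform bar of length l the temperature θ(x, t) at a distance x from one end
satisfies the equation

\[
\frac{\partial^2 \theta}{\partial x^2} = a^2 \frac{\partial \theta}{\partial t}
\]

where a is a constant. The end x = l is kept at zero temperature and the other end
x = 0 is perfectly insulated, so that

θ(l, t) = 0, ∂θ/∂x (0, t) = 0 (t > 0)

Using the method of separation of variables, show that if initially the temperature in
the bar is θ(x, 0) = f(x) then subsequently the temperature is

\[
\theta(x, t) = \sum_{n=0}^{\infty} A_{2n+1} \cos\left[\frac{(2n+1)\pi x}{2l}\right]
\exp\left[-\frac{(2n+1)^2 \pi^2 t}{4a^2 l^2}\right]
\]

where

\[
A_{2n+1} = \frac{2}{l}\int_0^l f(x) \cos\left[\frac{(2n+1)\pi x}{2l}\right] dx
\]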


Given θ(x, 0) = θ_0(l − x), where θ_0 is a constant, determine the subsequent
temperature in the bar.

6 Prove that if z = x/√t and φ(x, t) = f(z) satisfies the heat-conduction equation

\[
\kappa \frac{\partial^2 \phi}{\partial x^2} = \frac{\partial \phi}{\partial t} \tag{9.94}
\]

then f(z) must be of the form

\[
f(z) = A\,\mathrm{erf}\left(\frac{z}{2\sqrt{\kappa}}\right) + B
\]

where A and B are constants and the error function is defined as

\[
\mathrm{erf}(\xi) = \frac{2}{\sqrt{\pi}} \int_0^{\xi} e^{-u^2}\,du
\]

A heat-conducting solid occupies the semi-infinite region x ≥ 0. At time t = 0 the
temperature everywhere in the solid has the value T_0. The temperature at the surface,
x = 0, is suddenly raised, at t = 0, to the constant value T_0 + φ_0 and is then
maintained at this temperature. Assuming that the temperature field in the solid has
the form

T = φ(x, t) + T_0

where φ satisfies (9.94) in x > 0, find the solution of this problem.


7 Use the explicit method and the Crank-Nicolson formula to solve the heat-conduction
equation

\[
\frac{\partial^2 \phi}{\partial x^2} = \frac{\partial \phi}{\partial t}
\]

given that φ satisfies the conditions


=i (OSes =) 


39 [6 (70 ,. 
Әх И э 


Compute φ(x, t) at x = 0, 0.2, 0.4, 0.6, 0.8, 1 when t = 0.004 and t = 0.008.


8 An infinitely long bar of square cross-section has faces x = 0, x = a, y = 0, y = a.
The bar is made of heat-conducting material, and under steady-state conditions the
temperature T satisfies the Laplace equation

\[
\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} = 0
\]

All the faces except y = 0 are kept at zero temperature, while the temperature in the
face y = 0 is given by T(x, 0) = x(a − x). Show that the temperature distribution in
the bar is

\[
T(x, y) = \frac{8a^2}{\pi^3} \sum_{r=0}^{\infty}
\frac{\sin[(2r+1)\pi x/a]\,\sinh[(2r+1)\pi (a-y)/a]}{(2r+1)^3 \sinh(2r+1)\pi}
\]


9 (Harder) The function φ satisfies the Poisson equation

\[
\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} = xy
\]

inside the area bounded by the parabola y² = x and the line x = 2. The function φ is
given at all points on the boundary as φ = 1. By using a square grid of side ½ and
making full use of symmetry, formulate a set of finite-difference equations for the
unknown values φ, and solve.


10 A semi-infinite region of incompressible fluid of density ρ and viscosity μ is
bounded by a plane wall in the plane z = 0 and extends throughout the region z > 0.
The wall executes oscillations in its own plane so that its velocity at time t is
U cos ωt. No pressure gradients or body forces are operative. It can be shown that the
velocity of the liquid satisfies the equation

\[
\frac{\partial u}{\partial t} = \nu \frac{\partial^2 u}{\partial z^2}
\]

where ν = μ/ρ. Establish that an appropriate solution of the equation is

\[
u = U e^{-\alpha z} \cos(\omega t - \alpha z)
\]

where α = √(ω/2ν).


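11 Determine the value of the constant k so that

\[
U = t^k e^{-r^2/4t}
\]

satisfies the partial differential equation

\[
\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 \frac{\partial U}{\partial r}\right) = \frac{\partial U}{\partial t}
\]

Sketch the solution for successive values of t.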


The function z(x, y) satisfies 


Ол КОСЕ 
mio и 


with the boundary conditions
z = 2x when y = −x (x > 0)


Find the unique solution for z and the region in 
which this solution holds. Check the solution 
using MAPLE. 


13 The function φ(x, y) satisfies the Laplace equation

\[
\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} = 0
\]

in the region 0 < x < π, 0 < y, and also the boundary conditions

φ → 0 as y → ∞
φ(0, y) = φ(π, y) = 0

Show that an appropriate separation of variables solution is

\[
\phi = \sum_{n=1}^{\infty} c_n \sin(nx)\,e^{-ny}
\]

Show that if further

φ(x, 0) = x(π − x)

then c_{2m} = 0 while the odd coefficients are given by

\[
c_{2m+1} = \frac{8}{\pi (2m+1)^3}
\]


14 The boundary-value problem associated with the torsion of a prism of rectangular
cross-section −a ≤ x ≤ a, −b ≤ y ≤ b entails the solution of

\[
\frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} = -2
\]

subject to ψ = 0 on the boundary. Show that the differential equation and the boundary
conditions on x = ±a are satisfied by a solution of the form

\[
\psi = a^2 - x^2 + \sum_{n=0}^{\infty} A_{2n+1}
\cosh\left[\frac{(2n+1)\pi y}{2a}\right] \cos\left[\frac{(2n+1)\pi x}{2a}\right]
\]

From the condition ψ = 0 on the boundaries y = ±b, evaluate the coefficients A_{2n+1}.


15 When 0 < x < 1 and t > 0 the function u(x, t) satisfies the wave equation

\[
\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 u}{\partial t^2}
\]

and is also subject to the following boundary conditions:

(a) u(0, t) = u(1, t) = 0 for all t ≥ 0
b) 2G, 0)=0 (0<x<1) 


(9) Бойу = (Oh 10) 


Use the separation method to find the solution for u(x, t) that is valid for
0 < x < 1 and t > 0.


16 The excess porewater pressure u(z, t) in an infinite layer of clay satisfies the
diffusion equation

\[
\frac{\partial u}{\partial t} = c \frac{\partial^2 u}{\partial z^2} \qquad (t > 0,\ 0 < z < h)
\]

where t is the time in minutes, z is the vertical height in metres from the base of the
clay layer and c is the coefficient of consolidation. There is complete drainage at the
top and bottom of the clay layer, which is of thickness h. The distribution of excess
porewater pressure u(z, t) is A at t = 0 where A is a constant. Show that

\[
u(z, t) = \frac{4A}{\pi} \sum_{n=0}^{\infty} \frac{1}{2n+1}
\sin\left[\frac{(2n+1)\pi z}{h}\right] e^{-c(2n+1)^2 \pi^2 t/h^2}
\]


17 By seeking a separated solution of the form φ = X(x)T(t), find a solution to the
telegraph equation

\[
\frac{\partial^2 \phi}{\partial x^2} = \frac{1}{c^2}\left(\frac{\partial^2 \phi}{\partial t^2} + K \frac{\partial \phi}{\partial t}\right)
\]

satisfying the conditions

(a) φ = A cos px for all values of x and for t = 0
    for the case when c²p² > ¼K²;
(b) φ = A and ∂φ/∂t = −½AK for x = 0 and t = 0.


18 For the two-dimensional flow of an incompressible fluid the continuity equation may
be expressed as

\[
\frac{\partial}{\partial r}(r v_r) + \frac{\partial v_\theta}{\partial \theta} = 0
\]

where r and θ are polar coordinates in a plane parallel to the flow, and v_r and v_θ
are the respective velocity components. Show that a stream function ψ such that

\[
v_r = \frac{1}{r}\frac{\partial \psi}{\partial \theta}, \qquad v_\theta = -\frac{\partial \psi}{\partial r}
\]

satisfies the continuity equation. Take

\[
\psi = U r \sin\theta - \frac{U a^2 \sin\theta}{r}
\]

and interpret the solution physically.


(An extended problem) Section 9.9 looked at 
wave propagation caused by moving loads on 
cables. For loads on beams a similar analysis 
models such problems as trains going over 
bridges or loads moving on gantry cranes. 
Use a similar analysis for the beam equation 


4 2 
с л 


20 (An extended problem) In the blood-flow model in Section 9.10 consider the
following cases:
(a) S = constant, g = 0 for a pulsating flow v = v_0 e^{jωt} at x = 0 for all t
(b) S = constant, r = 0 for a pulsating flow v = v_0 e^{jωt} at x = 0 for all t
(c) the balanced-line case when gρ = rc. Show that v = e^{-gt/c} U gives

\[
\frac{\partial^2 U}{\partial t^2} = \frac{1}{c\rho}\frac{\partial^2 U}{\partial x^2}
\]

Solve the equation and interpret your solution.
21 (An extended problem) Fluid flows steadily in the two-dimensional channel shown in
Figure 9.65. The temperature θ = θ(x, t) depends only on the distance x along the
channel and the time t. The fluid flows at a constant rate so that an amount L crosses
any given section in unit time. The specific heat of the fluid is a constant c, and the
heat H in a length δx with cross-section S is therefore

H = c(S δx)θ

Figure 9.65 An element of the channel in Review exercise 21.


Heat is transferred through the walls of the 
channel, AB and DC, at a rate proportional to the 
temperature difference between the inside and 
outside. Heat conduction in the x direction is 
neglected. Show that the heat balance in the 
element ABCD leads to the equation 
ОЕ ов ео ъа 0) 
дї дх 

This type of analysis can now be applied to the 
long heat exchanger illustrated in Figure 9.66. The 
configuration is considered to be two-dimensional 
and symmetric with respect to the x axis; in the 
inner region the flow is to the right, while in the 
outer regions it is to the left. The regions are 
separated by metal walls in which similar 
assumptions to the above are made, except 
of course there is no fluid flow. 

Set up the equations of the system in the form 


00 90 

eS; F = -Laz * 2k(0, — 0.) 
90 

25, = = (0 - 6) + (0; — 6)) 
90 90. 

су5» E = Lo + 000, – 6) 


Өз(х, t) -——— L4 
MLL) ULL 


A(x, 0) 
VMAS itll 


Өзбх, 10) <— L; 


Figure 9.66 Heat-exchanger configuration in 
Review exercise 21. 


where the assumption is made that there is no 
heat flow through the outside lagged walls. Solve 
the steady-state equations and fit the arbitrary 
constants to the conditions that at the inlet 
(x = 0) the fluid enters the inner region at a given 
temperature 0, — 7j, while at the outlet (x > ©) 
the fluid in the outer regions enters at a given 
temperature 0, = 7;. Find flow rates that ensure 
that this situation is possible, and discuss the 
implications of any results obtained. 

Discuss the assumptions made in setting up this problem, the limitations imposed by
the assumptions, possible applications of this type of analysis, and extensions of the
work, for example a time-dependent solution.


10.1 Introduction

The need to get the ‘best’ out of a system is a very strong motivation in much of
engineering. A typical problem may be to obtain the maximum amount of product or to
minimize the cost of a process or to find a configuration that gives maximum strength.
Sometimes what is ‘best’ is easy to define, but frequently the problem is not so clear
cut, and a lot of thought is required to reach an appropriate function to optimize. In most
cases there are very severe and natural constraints operating: the problem may be one
of maximizing the amount of product, subject to the supply of materials; or it may be
minimizing the cost of production, with constraints due to safety standards. Indeed,
much of modern optimization is concerned with constraints and how to deal with them.

We have seen in Chapter 9 of Modern Engineering Mathematics how to obtain the
maximum and minimum of a function of many variables. However, the methods described
there founder very quickly because it becomes impossible to solve the resulting
equations analytically. A simple one-dimensional example soon shows that a numerical
solution is required.

Example 10.1

Find the positive x value that maximizes the function

\[
y = \frac{\tanh x}{1 + x}
\]

Solution

Equating the derivative to zero gives

\[
\frac{dy}{dx} = 0 = \frac{(1 + x)\,\mathrm{sech}^2 x - \tanh x}{(1 + x)^2}
\]

so that we need to solve

\[
1 + x = \tfrac{1}{2}\sinh 2x
\]

which has no simple positive solutions that can be obtained analytically.
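
To make the need for a numerical method concrete, the stationarity condition of Example 10.1 can be solved by simple bisection. This is only a sketch: the bracket [0.5, 2] and the tolerance are choices made here, and bisection is used for transparency, not because it is one of the algorithms of Section 10.4.

```python
import math


def y(x):
    """Objective of Example 10.1."""
    return math.tanh(x) / (1.0 + x)


def f(x):
    """Stationarity condition written as f(x) = 0: (1/2) sinh 2x - (1 + x)."""
    return 0.5 * math.sinh(2.0 * x) - (1.0 + x)


def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; requires func(lo) and func(hi) to have opposite signs."""
    flo = func(lo)
    if flo * func(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = func(mid)
        if flo * fmid <= 0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)


# f(0.5) < 0 and f(2) > 0, so the assumed bracket contains the root.
x_star = bisect(f, 0.5, 2.0)
print("stationary point x =", x_star, " y(x) =", y(x_star))
```

Since f′(x) = cosh 2x − 1 ≥ 0, f is increasing and the positive root is unique; y vanishes at x = 0 and as x → ∞ and is positive in between, so this stationary point is the required maximum.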


To solve such problems, a set of numerical algorithms was developed during the 
1960s as fast computers became available to perform the large amounts of arithmetic 
required. These algorithms will be described in Section 10.4. Perhaps the main stimulus 
for this development came from the space industries, where small percentage savings, 
achieved by doing some mathematics, could save vast amounts of money. The ideas 
were quickly taken up by ‘expensive’ areas of engineering, such as the chemical and 
steel industries and aircraft production. 

The idea of dealing with constraints is not new: Lagrange developed the theory of
equality-constrained optimization around the 1800s. However, it was not until the 1940s
that inequality constraints were studied with any seriousness. The use of Lagrange
multipliers for equality constraints was also introduced in Chapter 9 of Modern
Engineering Mathematics, and will be looked at again in more detail in Section 10.3 below.

The only work on inequality constraints will be for linear programming problems in
Section 10.2. Where inequality constraints are nonlinear, the problems become very
difficult, and are the province of specialist books on optimization. Linear programming,
however, is much more straightforward, and the basic simplex algorithm has been
spectacularly successful, so successful in fact that many workers try to force their problems
to be linear when they are clearly not. The computer scientist’s maxim GIGO (‘garbage
in, garbage out’) is very applicable to people who try to fit the problem to the
mathematics rather than the mathematics to the problem!

Before considering detailed methods of solution of optimization problems, we
shall look at a few examples. Let us first revisit an extended form of the milk
carton problem considered in Example 8.34 (and illustrated in Figure 8.38) of Modern
Engineering Mathematics.

Example 10.2

A milk carton is designed from a sheet of waxed cardboard as illustrated in Figure 10.1,
where a 5 mm overlap has been allowed. It is to contain 2 pints of milk, and we require
the minimum surface area for the carton.

Figure 10.1 Waxed cardboard milk container opened up, with measurements in millimetres
and with a 5 mm overlap.

Solution

The only difference between this example and Example 8.34 of Modern Engineering
Mathematics is that we no longer insist on a square cross-section. The total area in
square millimetres is

\[
A = (2b + 2w + 5)(h + b + 10)
\]

and the volume of the two-pint container is

\[
\text{volume} = hbw = 1\,136\,000\ \text{mm}^3
\]

We first note that a constraint, the given volume, occurs naturally in the problem. Because
of its simplicity, we can eliminate w from the constraint to give

\[
A = (h + b + 10)\left(\frac{2\,272\,000}{hb} + 2b + 5\right)
\]

Following the standard minimization procedure and equating partial derivatives to
zero gives

\[
\frac{\partial A}{\partial h} = \frac{2\,272\,000}{hb} + 2b + 5 - (h + b + 10)\,\frac{2\,272\,000}{h^2 b} = 0
\]
\[
\frac{\partial A}{\partial b} = \frac{2\,272\,000}{hb} + 2b + 5 + (h + b + 10)\left(-\frac{2\,272\,000}{h b^2} + 2\right) = 0
\]

We therefore have two highly nonlinear equations in the two unknowns h and b, which
cannot be solved without resorting to numerical techniques. We shall return to this problem
later in Examples 10.11 and 10.17 to see how a practical solution can be obtained.

Figure 10.2 The moonshot problem.
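
As an indication of what "resorting to numerical techniques" can look like for the two stationarity equations of Example 10.2, here is a minimal sketch. It is not the method of Examples 10.11 or 10.17: it uses a shortcut introduced here, namely that for fixed b the area has the form (h + B)(K/h + C), which is minimized at h = √(BK/C), followed by a golden-section search over b. The search interval [20, 200] mm and the tolerance are assumptions.

```python
import math

V2 = 2_272_000.0  # 2 x 1 136 000 mm^3, the constant appearing in A(h, b)


def area(h, b):
    """Surface area A(h, b) after eliminating w = 1 136 000 / (h b)."""
    return (h + b + 10.0) * (V2 / (h * b) + 2.0 * b + 5.0)


def grad(h, b):
    """The two partial derivatives dA/dh and dA/db from the text."""
    common = V2 / (h * b) + 2.0 * b + 5.0
    dA_dh = common - (h + b + 10.0) * V2 / (h * h * b)
    dA_db = common + (h + b + 10.0) * (2.0 - V2 / (h * b * b))
    return dA_dh, dA_db


def best_h(b):
    """For fixed b, A = (h + B)(K/h + C) has dA/dh = C - B*K/h^2 = 0 at h = sqrt(B*K/C)."""
    B, K, C = b + 10.0, V2 / b, 2.0 * b + 5.0
    return math.sqrt(B * K / C)


def golden_min(func, lo, hi, tol=1e-6):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if func(c) < func(d):
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)


b_star = golden_min(lambda b: area(best_h(b), b), 20.0, 200.0)
h_star = best_h(b_star)
w_star = 0.5 * V2 / (h_star * b_star)
print("h, b, w (mm):", h_star, b_star, w_star)
print("area (mm^2):", area(h_star, b_star), " gradient:", grad(h_star, b_star))
```

Both components of the printed gradient should be close to zero, confirming that the point found satisfies the two equations above to within the search tolerance.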


Most practical optimization problems come from very expensive projects where
savings of a few per cent can be very significant. Laying a natural gas or water pipe
network is a typical example. Without considering the expense of installing
compressors, the problem is to minimize the capital cost. This cost is directly related to the
weight of the pipe, subject to constraints imposed by pressure-drop limitations, which in
turn depend on the pipe diameter in a nonlinear way. Adding the compressors imposes
further costs and constraints.

Heat exchangers provide an example of a system where we try to remove heat. We 
design the flow rates, the pipe sizes and pipe spacing to maximize the heat transferred. 
A related heating problem might be the design of an industrial furnace. It is required 
that the energy consumption be minimized subject to constraints on the heat flow and 
the maintenance of various temperatures. 

A final example, the moonshot problem, illustrates a large-scale, very complicated
problem that stimulated many of the recent developments in optimization (see Figure 10.2).
Which path from a point on the Earth to a point on the Moon should be chosen to 
minimize the weight of fuel carried by a rocket? The complicated relation between the 
weight of fuel, the mechanical equations of the rocket and the path must be established 
before it is possible to proceed to obtain the optimum. The numerous constraints on the 
strengths of materials, the maximum tolerable acceleration etc. add to the difficulty 
of the problem. 

In the problems discussed we have assumed that an optimum exists at a point, and
we have asked for the mathematical conditions that must hold. The other way round is
much more difficult. Given that the appropriate conditions hold, does an optimum exist,
and if so what type of optimum is it? For many simple finite-dimensional problems
these conditions are known, but may not be very simple to apply. To serve as a reminder,
the condition f′(0) = 0 is a necessary condition for a maximum to exist for the
differentiable function f(x) at x = 0. It is not sufficient, however, as can be seen from the three
functions f_1(x) = x², f_2(x) = x³ and f_3(x) = −x², which have respectively a minimum, a
point of inflection and a maximum at the origin. In many dimensions the difficulties are
similar, but much more complicated.


10.2 Linear programming

10.2.1 Introduction

In Section 10.1 it was indicated that constraints are very important in most applications.
When all functions are linear, there is an extremely efficient algorithm, developed by Dantzig
in the 1940s, which will be described for the linear programming (LP) problem.

We shall start by posing a particular problem and looking at a simple graphical
solution.

Example 10.3

A manufacturing company makes two circuit boards R1 and R2, constructed as follows:

R1 comprises 3 resistors, 1 capacitor, 2 transistors and 2 inductances;
R2 comprises 4 resistors, 2 capacitors and 3 transistors.

The available stocks for a day's production are 2400 resistors, 900 capacitors, 1600
transistors and 1200 inductances. It is required to calculate how many R1 and how many
R2 the company should produce daily in order to maximize its overall profits, knowing
that it can make a profit on an R1 circuit board of 5p and on an R2 circuit board of 9p.

Solution

If the company produces daily x of type R1 and y of type R2 then its stock limitations give


3x + 4y ≤ 2400 (10.1a)
 x + 2y ≤  900 (10.1b)
2x + 3y ≤ 1600 (10.1c)
2x      ≤ 1200 (10.1d)
x ≥ 0, y ≥ 0

and it makes a profit z given by

z = 5x + 9y (10.2)

These inequalities are plotted on a diagram as in Figure 10.3(a). The shaded region defines 
the area for which all the inequalities are satisfied, and is called the feasible region. The 
lines of constant profit z = constant, defined by (10.2), are plotted as ‘dashed’ lines in 
Figure 10.3(b). It is clear from the geometry that the largest possible value of z that 
intersects the feasible region is at S with x = 500, y = 200, and this gives the optimal
solution. At this point we can analyse the usage of the stocks as in Figure 10.4 and note
that a profit of £43 has been made.

Figure 10.3 (a) Feasible region for the circuit board manufacture problem. (b) Lines of
constant z show that S (500, 200) gives the optimum.

Figure 10.4 Table of stock usage.

              Available   Used   Left over
Resistors          2400   2300         100
Capacitors          900    900           0
Transistors        1600   1600           0
Inductances        1200   1000         200
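
The graphical argument of Example 10.3 can be mimicked in a few lines by brute force: intersect every pair of constraint lines, keep the intersection points that satisfy all the constraints, and evaluate the profit at each. This is only a sketch for a two-variable problem (it is not the simplex method), and the helper names `corners` and `profit` are introduced here for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint a*x + b*y <= c is stored as (a, b, c);
# x >= 0 and y >= 0 are written as -x <= 0 and -y <= 0.
constraints = [
    (3, 4, 2400),   # resistors   (10.1a)
    (1, 2, 900),    # capacitors  (10.1b)
    (2, 3, 1600),   # transistors (10.1c)
    (2, 0, 1200),   # inductances (10.1d)
    (-1, 0, 0),
    (0, -1, 0),
]


def corners(cons):
    """Feasible intersection points of all pairs of constraint lines (exact arithmetic)."""
    pts = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines have no unique intersection
        x = Fraction(c1 * b2 - c2 * b1, det)   # Cramer's rule
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in cons):
            pts.add((x, y))
    return pts


def profit(point):
    """Profit z = 5x + 9y in pence, equation (10.2)."""
    return 5 * point[0] + 9 * point[1]


best = max(corners(constraints), key=profit)
for p in sorted(corners(constraints)):
    print(f"corner ({p[0]}, {p[1]}): z = {profit(p)}")
print("best corner:", best, "profit:", profit(best))
```

The enumeration grows combinatorially with the number of constraints, which is one reason a systematic corner-to-corner search is preferred for larger problems.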


Example 10.3 has encapsulated much of the LP method, and we shall try to extract 
the maximum amount of information from this example. The graphical method will 
only work if the problem has two variables, so we need to consider how to translate the 
geometry into an algebraic form that will work with any number of variables. Although 
we shall concentrate in this chapter on small problems in order to illustrate the methods, 
in practical problems there can be hundreds of variables and constraints. Large
problems bring further difficulties that will not be considered here; for instance, how a large
amount of information can be input into a computer accurately or how large data sets
are handled in the computer. In the MATLAB implementation of LP, there is a specific 
option to deal with ‘LargeScale’ problems. 

From Figure 10.3 it can be seen that the solutions must be at a ‘corner’ of the feasible 
region, other than in the exceptional case when the profit line z = constant is parallel to 
one of the constraints. This follows through into many-dimensional problems, so that it is 
only necessary to inspect the corners of the feasible region. The simplex method, described 
in Section 10.2.3, uses this fact and selects a starting corner, chooses the neighbouring 
corner that increases z the most, and then repeats the process until no improvement is 
possible. The method writes the equations into a standard form; it then automates the 
choice of corner and finally reprocesses the equations back to the standard form again. 

Once a solution has been obtained, it may be observed from Figures 10.3 and 10.4 
that the binding constraints (b) and (c) intersect at S and are satisfied identically, so 
that all the stocks are used, while the non-binding constraints (a) and (d) leave some
stock unused. It can also be seen from Figure 10.3 that the constraint (a) is redundant
since it does not intersect the feasible region. These might appear obvious comments, 
but they prove to be useful and relevant observations when a sensitivity analysis is 
performed. Such an analysis asks whether or not the solution changes as the stocks vary 
or the costs vary, or the coefficients are changed. In practice, parameters vary over a 
period, and we wish to know whether a new calculation must be performed or whether 
the solution that we have already obtained can be used. 
There are a whole series of special cases that we must consider: 


(a) Does a feasible region exist? If the system is modelled correctly, a feasible region 
always exists for a sensible problem; but if an error is made then it is easy to eliminate 
the region completely. For instance, in (10.1d) putting —1200 instead of 1200 would 
give an empty region, and this must be detected by any program. 


(b) Is the solution degenerate? If the profit lines z = constant are parallel to any one of 
the final binding constraints then any feasible part of the constraint gives a solution. 


(c) Can we get unbounded regions and solutions? Again it is very easy to construct 
problems where the regions are unbounded and finite solutions may or may not 
exist. Just maximizing (10.2) subject only to the single constraint (10.1d) provides 
an unbounded region. Interpreting this situation, the only constraint that we have 
is on the inductances. If this were true, we could make infinitely many circuits of 
type R2 and make an infinite profit! 


10.2.2 Simplex algorithm: an example


We now need to convert the ideas of Section 10.2.1 into a useful algebraic algorithm. 
There is a whole array of technical terms that are used in LP, and they will be intro- 
duced as we reconsider Example 10.3 to develop the solution method. The first step is 
to introduce slack variables r, s, t and u into (10.1) to make the inequality constraints 
into equality constraints. 

If we are given x and y in the feasible region, the variables r, s, £ and u provide a 
measure of how much ‘slack’ is available before all the corresponding resource is used 
up, so 


3x + 4y +r = 2400 (10.3a) 

x+2y + 5 = 900 (10.3) 
2х + Зу *t = 1600 (10.3с) 
2х +и = 1200 (10.34) 


where x, y, r, s, t and u are now all greater than or equal to zero. We now have more 
variables than equations, and this enables us to construct a feasible basic solution by 
inspection: 


x = y = 0 (non-basic variables); r = 2400, s = 900, t = 1600, u = 1200 (basic variables)


with 4 basic variables (the same number as constraints, which are non-zero) and 2 non- 
basic variables (the remainder of the variables, which are zero). 
The algebraic equivalent of moving to a neighbouring corner is to increase one of 
the non-basic variables from zero to its largest possible value. From the profit function 
given in (10.2), we have 

z = 5x + 9y


Currently z has the value zero, and it seems sensible to change y, since the coefficient 
of y is larger; this will increase z the most. So keep x = 0 in (10.3) and increase y to its 
maximum value in each case: either 

(a) change y to 600 and reduce r to zero, or 

(b) change y to 450 and reduce s to zero, or 

(c) change y to 533 1/3 and reduce t to zero, or 

(d) note that (10.3d) does not involve y, so changing y has no effect on u. 

Choose option (b), since increasing y above 450 will make s negative, which would then 
violate the condition that all variables must be non-negative. Interchange s and y between the 
set of basic and non-basic variables and rewrite in the same form as (10.3). This is 
achieved by solving for y from (10.3b), y = 450 - (1/2)x - (1/2)s, and substituting to give 


x - 2s + r = 600 (10.4a)
(1/2)x + (1/2)s + y = 450 (10.4b)
(1/2)x - (3/2)s + t = 250 (10.4c)
2x + u = 1200 (10.4d)

and, from (10.2), 

z = 4050 + (1/2)x - (9/2)s (10.4e)


The problem is now reduced to exactly the same form as (10.3), and the same pro- 
cedure can be applied. The non-basic variables are x = s = 0, and the basic variables are 
r = 600, y = 450, t = 250 and u = 1200, and z has increased its value from 0 to 4050. 

Now only x can be increased, since increasing the other non-basic variable, s, would 
decrease z. Increasing x to 500 in (10.4c) and reducing t to 0 is the best that can be done. 
Using (10.4c) to write x = 3s - 2t + 500, we now eliminate x from the other equations 
to give 


s - 2t + r = 100 (10.5a)
2s - t + y = 200 (10.5b)
-3s + 2t + x = 500 (10.5c)
6s - 4t + u = 200 (10.5d)

and 

z = 4300 - 3s - t (10.5e)


We now have the final solution, since increasing s or t can only decrease z. Thus we 
have x = 500, y = 200, which is in agreement with the previous graphical solution, the 
maximum profit is z = 4300 as before, and the amounts left over in Figure 10.4 are just 
the 100 and 200 appearing on the right-hand sides of (10.5a, d). 
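
The choice made at each stage of the argument above is a ratio test: for the variable being increased, each equation of (10.3) gives a limit, and the smallest limit decides which basic variable drops to zero. The following is a minimal Python sketch of that test for the first move (increase y with x = 0), followed by a check of the final corner. The names `rows` and `largest_increase` are illustrative only and do not come from the text.

```python
from fractions import Fraction

# Constraint rows of (10.3): coefficient of x, coefficient of y, right-hand side.
rows = {
    "r": (3, 4, 2400),
    "s": (1, 2, 900),
    "t": (2, 3, 1600),
    "u": (2, 0, 1200),
}

def largest_increase(column):
    """Return (limit, slack driven to zero) when one variable is increased.

    `column` is 0 for x or 1 for y; the other variable is held at zero.
    """
    limits = {
        name: Fraction(row[2], row[column])
        for name, row in rows.items()
        if row[column] > 0          # a zero coefficient imposes no limit
    }
    leaving = min(limits, key=limits.get)
    return limits[leaving], leaving

limit, leaving = largest_increase(1)   # increase y with x = 0
print(limit, leaving)                  # 450 s

# Check the final corner x = 500, y = 200 against (10.3) and the profit (10.2).
x, y = 500, 200
slack = {name: rhs - a * x - b * y for name, (a, b, rhs) in rows.items()}
profit = 5 * x + 9 * y
print(slack, profit)
```

The slack values computed at the final corner should reproduce the right-hand sides of (10.5a) and (10.5d), with s and t both zero.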

We have just described the essentials of the simplex algorithm, although the method 
of working may have appeared a little haphazard. It can be tidied up and formalized 
by writing the whole system in tableau form. Equations (10.3) are written with the 
basic variables in the left-hand column, the coefficients in the equations placed in the 
appropriate array element and the objective function z placed in the first row with minus 
signs inserted. 

                          Non-basic variables   Basic variables
                          x      y              r    s    t    u    Solution
Objective function   z    -5     -9             0    0    0    0    0
Basic variables      r    3      4              1    0    0    0    2400
                     s    1      2              0    1    0    0    900
                     t    2      3              0    0    1    0    1600
                     u    2      0              0    0    0    1    1200





The current solution can easily be read from the tableau. The basic variables in the 
left-hand column are equal to the values in the solution column, so r = 2400, s = 900, 
t = 1600 and u = 1200. The remaining non-basic variables are zero, namely x = y = 0. 
The profit z is read similarly as the entry in the solution column, namely z = 0. The 
negative signs in the z row ensure that z remains positive in the subsequent manipulation. 
It should be noted that a 4 × 4 unit matrix occurs in the tableau 
in the basic variable columns, with zeros occurring above in the z row. This standard 
display is always the starting place for the simplex method, with the only possible 
complication being that the columns of the unit matrix might be shuffled around. The 
algorithm can now be performed in a series of steps: 


Step 1 


Choose the most negative entry in the z row and mark that column (the y column in 
this case). 


Step 2 


Evaluate the ratios of the solution column and the positive entries in the y column, 
choose the smallest of these and mark that row (the s row in this case). 








     x     y     r    s    t    u    Solution   Ratios
z    -5    -9    0    0    0    0    0
r    3     4     1    0    0    0    2400       2400/4 = 600
s    1     (2)   0    1    0    0    900        900/2 = 450
t    2     3     0    0    1    0    1600       1600/3 = 533 1/3
u    2     0     0    0    0    1    1200       -





Step 3 


Change the marked basic variable in the left-hand column to the marked non-basic variable 
in the top row (in this case s changes to y in the left-hand column). 
Step 4 


Make the pivot (the element in the position where the marked row and column cross) 
equal to 1 by dividing through. In this case we divide the row elements by 2. This series of 
steps leads to the tableau 

     x     y     r    s     t    u    Solution
z    -5    -9    0    0     0    0    0
r    3     4     1    0     0    0    2400
y    1/2   1     0    1/2   0    0    450
t    2     3     0    0     1    0    1600
u    2     0     0    0     0    1    1200





Step 5 


Clear the y column by subtracting an appropriate multiple of the y row (this is just 
Gaussian elimination); for example, (z row) + 9 × (y row), (r row) - 4 × (y row) and 
so on. This leads to the tableau 

     x      y    r    s      t    u    Solution
z    -1/2   0    0    9/2    0    0    4050
r    1      0    1    -2     0    0    600
y    1/2    1    0    1/2    0    0    450
t    1/2    0    0    -3/2   1    0    250
u    2      0    0    0      0    1    1200





This tableau can now easily be recognized as equations (10.4). It may be noted that the 
unit matrix appears in the tableau again, with the columns permuted, and the z row has 
zero entries in the basic variable columns. 

The tableau is in exactly the standard form, and is ready for reapplication of the five 
given steps. Steps 1 and 2 give the tableau 








     x       y    r    s      t    u    Solution   Ratios
z    -1/2    0    0    9/2    0    0    4050
r    1       0    1    -2     0    0    600        600
y    1/2     1    0    1/2    0    0    450        900
t    (1/2)   0    0    -3/2   1    0    250        500
u    2       0    0    0      0    1    1200       600
Steps 3, 4 and 5 then produce a final tableau (compare with equations (10.5)) 

     x    y    r    s     t     u    Solution
z    0    0    0    3     1     0    4300
r    0    0    1    1     -2    0    100
y    0    1    0    2     -1    0    200
x    1    0    0    -3    2     0    500
u    0    0    0    6     -4    1    200





All the entries in the z row are now non-negative, so the optimum is achieved. The solution 
is read from the tableau directly; the left-hand column equals the right-hand column, 
giving z = 4300, r = 100, y = 200, x = 500 and u = 200, which is in agreement with the 
solution obtained in Example 10.3. 
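
Steps 4 and 5 are plain row operations, so the tableaux above can be reproduced mechanically. The sketch below, written in Python with exact fractions, assumes the pivot positions already chosen in the text (y column with s row, then x column with t row); the function name `pivot` and the list layout are illustrative choices, not part of the original method description.

```python
from fractions import Fraction

def pivot(tableau, row, col):
    """Steps 4 and 5: scale the pivot row, then clear the pivot column."""
    p = tableau[row][col]
    tableau[row] = [v / p for v in tableau[row]]
    for i, r in enumerate(tableau):
        if i != row and r[col] != 0:
            factor = r[col]
            tableau[i] = [a - factor * b for a, b in zip(r, tableau[row])]

# Columns: x, y, r, s, t, u, Solution. Row 0 is the z row.
T = [[Fraction(v) for v in line] for line in [
    [-5, -9, 0, 0, 0, 0, 0],
    [3, 4, 1, 0, 0, 0, 2400],
    [1, 2, 0, 1, 0, 0, 900],
    [2, 3, 0, 0, 1, 0, 1600],
    [2, 0, 0, 0, 0, 1, 1200],
]]
basis = ["r", "s", "t", "u"]

pivot(T, 2, 1)                 # y enters, s leaves
basis[1] = "y"
first = [list(r) for r in T]   # copy of the tableau matching (10.4)

pivot(T, 3, 0)                 # x enters, t leaves
basis[2] = "x"

for name, r in zip(["z"] + basis, T):
    print(name, [str(v) for v in r])
```

Using `Fraction` rather than floating point keeps entries such as 1/2 and 9/2 exact, so the printed rows can be compared directly with the tableaux.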


Simplex algorithm: general theory 
We can now generalize the problem to the standard form of finding the maximum of 
the objective function 

z = c_1 x_1 + c_2 x_2 + ... + c_n x_n


subject to the constraints 

a_{11} x_1 + a_{12} x_2 + ... + a_{1n} x_n ≤ b_1
a_{21} x_1 + a_{22} x_2 + ... + a_{2n} x_n ≤ b_2
...
a_{m1} x_1 + a_{m2} x_2 + ... + a_{mn} x_n ≤ b_m      (10.6)

by the simplex algorithm, where b_1, b_2, ..., b_m are all positive. By introducing the 
slack variables x_{n+1}, ..., x_{n+m} ≥ 0 we convert (10.6) into the standard tableau 

          x_1      x_2      ...   x_n      x_{n+1}   x_{n+2}   ...   x_{n+m}   Solution
z         -c_1     -c_2     ...   -c_n     0         0         ...   0         0
x_{n+1}   a_{11}   a_{12}   ...   a_{1n}   1         0         ...   0         b_1
x_{n+2}   a_{21}   a_{22}   ...   a_{2n}   0         1         ...   0         b_2
...
x_{n+m}   a_{m1}   a_{m2}   ...   a_{mn}   0         0         ...   1         b_m








Any subsequent tableau takes this general form, with an m × m unit matrix in the basic 
variables columns. As noted in the previous example, the basic variables change, so the 
left-hand column will have m entries, which can be any of the variables x_1, ..., x_{n+m}. 
The unit-matrix columns are usually not in the above neat form but are permuted, and 
hence the zeros of the z row can be any of the 1 to n + m entries corresponding to the 
basic variables. 
The five basic steps in the algorithm follow quite generally: 


Step 1 


Choose the most negative value in the z row, say -c_i. (Identify column i) 
If all the entries are non-negative then the maximum has been achieved. 


Step 2 


Evaluate b_1/a_{1i}, b_2/a_{2i}, ..., b_m/a_{mi} for all positive a_{ki}. (Identify row j) 
Select the minimum of these numbers, say b_j/a_{ji}. 


Step 3 


Replace the basic variable in row j (x_{n+j} in the initial tableau) by x_i in the 
left-hand column. (Change the basis) 


Step 4 


In row j replace a_{jk} by a_{jk}/a_{ji} for k = 1, ..., n + m + 1. (Make pivot = 1) 
(Note that the first row and the final column are treated as part of the tableau for 
computation purposes, -c_k = a_{0k}, b_l = a_{l,n+m+1}.) 


Step 5 


In all other rows, l ≠ j, replace a_{lk} by a_{lk} - a_{li} a_{jk} for all k = 1, ..., 
n + m + 1 and on each row l = 0, 1, ..., m. (Gaussian elimination) 


The algorithm is then repeated until at Step 1 the maximum is achieved. The method 
provides an extremely efficient way of searching through the corners of the feasible region. 
To inspect all corners would require the computation of C(n + m, m) = (n + m)!/(n! m!) points, while the simplex 
algorithm reduces this very considerably, often down to something of the order of m + n. 
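
To make the five steps concrete, here is a minimal Python sketch of the whole loop for the standard form (10.6). It is an illustration under stated assumptions rather than a production solver: the function name `simplex_max` is hypothetical, ties are broken by taking the first candidate, and no protection against cycling is included. It is run on the data of Example 10.3 and the pivot count is compared with the number of candidate corners C(n + m, m).

```python
from fractions import Fraction
from math import comb

def simplex_max(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0, with all b_i >= 0.

    Returns (z, x, number of pivots). Raises ValueError if unbounded.
    """
    m, n = len(A), len(c)
    # Row 0 is the z row; rows 1..m carry the constraints and slack columns.
    T = [[Fraction(-cj) for cj in c] + [Fraction(0)] * (m + 1)]
    for i in range(m):
        slack = [Fraction(int(i == k)) for k in range(m)]
        T.append([Fraction(a) for a in A[i]] + slack + [Fraction(b[i])])
    basis = list(range(n, n + m))
    pivots = 0
    while True:
        col = min(range(n + m), key=lambda k: T[0][k])        # Step 1
        if T[0][col] >= 0:
            break
        ratios = [(T[i][-1] / T[i][col], i)                     # Step 2
                  for i in range(1, m + 1) if T[i][col] > 0]
        if not ratios:
            raise ValueError("unbounded region in the chosen column")
        _, row = min(ratios)
        basis[row - 1] = col                                    # Step 3
        p = T[row][col]
        T[row] = [v / p for v in T[row]]                        # Step 4
        for i in range(m + 1):                                  # Step 5
            if i != row:
                f = T[i][col]
                T[i] = [a - f * r for a, r in zip(T[i], T[row])]
        pivots += 1
    x = [Fraction(0)] * (n + m)
    for i, var in enumerate(basis):
        x[var] = T[i + 1][-1]
    return T[0][-1], x[:n], pivots

z, x, pivots = simplex_max([5, 9], [[3, 4], [1, 2], [2, 3], [2, 0]],
                           [2400, 900, 1600, 1200])
print(z, x, pivots, comb(2 + 4, 4))
```

For this small problem the loop needs two pivots, whereas there are 15 ways of choosing 4 basic variables from 6, which illustrates the saving described above.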

Several checks should be made at the completion of each cycle, since it may be 
possible to identify an exceptional case. Perhaps the most complicated of the exceptions 
is when one of the b_i = 0 during the calculation, implying that one of the basic variables 
is zero. This can be a temporary effect, in which case the problem goes away at the next 
iteration, or it may be permanent, and that basic variable is indeed zero in the optimal 
solution. The best that may be said, other than going into sophisticated techniques 
found in specialist books on LP, is that problems are possible and the computation 
should be watched carefully. The solution can get into a cycle that cannot be broken. 

A second exception, that should be noted carefully, occurs when one of the c_i = 0 for 
a non-basic variable in the optimal tableau. The normal simplex algorithm can then change 
the solution without changing the z row by selecting this i column at Step 1. Because 
c_i = 0, Step 5 is never used on the z row at all. This case corresponds to a degenerate 
solution with many alternative solutions to the problem, and geometrically the profit 
function is parallel to one of the constraints. 


The third exception occurs at Step 2 when all the a_{1i}, a_{2i}, ..., a_{mi} in the column 
chosen at Step 1 are zero or negative and it becomes impossible to identify a row to continue the 
method. The region in this case is unbounded, and a careful look at the original problem 
is required to decide whether this is reasonable, since it may still be possible to get a 
solution to such a problem. 


Example 10.4 


Find the maximum of 

z = 5x_1 + 4x_2 + 6x_3

subject to 

4x_1 + x_2 + x_3 ≤ 19
3x_1 + 4x_2 + 6x_3 ≤ 30
2x_1 + 4x_2 + x_3 ≤ 25
x_1 + x_2 + 2x_3 ≤ 15

x_1, x_2, x_3 ≥ 0


Solution 


The example cannot be solved graphically, since it has three variables, but is in a correct 
form for the simplex algorithm. The initial tableau gives the solution x_4 = 19, x_5 = 30, 
x_6 = 25, x_7 = 15 and non-basic variables x_1 = x_2 = x_3 = 0. 








      x_1   x_2   x_3   x_4   x_5   x_6   x_7   Solution   Ratios
z     -5    -4    -6    0     0     0     0     0
x_4   4     1     1     1     0     0     0     19         19/1 = 19
x_5   3     4     (6)   0     1     0     0     30         30/6 = 5
x_6   2     4     1     0     0     1     0     25         25/1 = 25
x_7   1     1     2     0     0     0     1     15         15/2 = 7.5





In the initial tableau the pivot is identified, and x_5 is removed from the basic variable 
column and replaced by x_3. The pivot is made equal to unity by dividing the x_5 row by 
6. The other entries in the x_3 column are then made zero by the Gaussian elimination in 
Step 5. This gives the tableau 

      x_1     x_2    x_3   x_4   x_5    x_6   x_7   Solution   Ratios
z     -2      0      0     0     1      0     0     30
x_4   (7/2)   1/3    0     1     -1/6   0     0     14         4
x_3   1/2     2/3    1     0     1/6    0     0     5          10
x_6   3/2     10/3   0     0     -1/6   1     0     20         13.3
x_7   0       -1/3   0     0     -1/3   0     1     5          -

The process is then repeated and the pivot is again found, x_4 is replaced by x_1 in the 
first column, and the next tableau is constructed by following the remaining steps of the 
simplex algorithm, giving the tableau 

      x_1   x_2     x_3   x_4    x_5     x_6   x_7   Solution
z     0     4/21    0     4/7    19/21   0     0     38
x_1   1     2/21    0     2/7    -1/21   0     0     4
x_3   0     13/21   1     -1/7   4/21    0     0     3
x_6   0     67/21   0     -3/7   -2/21   1     0     14
x_7   0     -1/3    0     0      -1/3    0     1     5





Thus the solution is now optimal, and gives x_1 = 4, x_2 = 0, x_3 = 3, and z = 38 as the 
maximum value. Note that the first two constraints are binding, that is satisfied exactly, 
while the other two are not. This can easily be deduced by looking at the slack variables 
introduced in the initial tableau. We have x_4 = x_5 = 0, corresponding to the first two 
constraints, and x_6 ≠ 0, x_7 ≠ 0 for the last two constraints. 
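
The statement about binding constraints can be checked directly by substituting the optimal values back into the original inequalities. The short Python sketch below does this; the variable names are illustrative only, and the optimal point is simply taken from the final tableau rather than recomputed.

```python
A = [[4, 1, 1], [3, 4, 6], [2, 4, 1], [1, 1, 2]]   # constraint coefficients
b = [19, 30, 25, 15]                               # right-hand sides
c = [5, 4, 6]                                      # objective coefficients
x = [4, 0, 3]                                      # optimum read from the final tableau

z = sum(ci * xi for ci, xi in zip(c, x))
# Slack in each constraint: right-hand side minus left-hand side.
slacks = [bi - sum(a * xi for a, xi in zip(row, x)) for row, bi in zip(A, b)]
binding = [s == 0 for s in slacks]
print(z, slacks, binding)
```

The slacks correspond to x_4, x_5, x_6 and x_7 in that order, so the two zero entries identify the binding constraints.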


Computers are particularly helpful when there is an efficient algorithm, such as the 
simplex algorithm for solving LP problems, since they can perform the arithmetic 
with speed and accuracy. A typical implementation of the algorithm in MAPLE for 
Example 10.4 is now given: 

with(simplex):
constr:={4*x1+x2+x3<=19,3*x1+4*x2+6*x3<=30,
2*x1+4*x2+x3<=25,x1+x2+2*x3<=15};
obj:=5*x1+4*x2+6*x3;
maximize(obj,constr,NONNEGATIVE);











These few lines of code give x_1 = 4, x_2 = 0 and x_3 = 3 instantly. Similarly in MATLAB, 
LP problems can be solved but are set up in a slightly different way. It always solves 
the minimum problem 

min_x f^T x such that Ax ≤ b, Aeq x = beq, lb ≤ x ≤ ub

and the way that the problem is tackled can be controlled in optimset. The following 
lines of code give the solution to Example 10.4: 

% f holds the negated objective coefficients, since linprog minimizes
f=[-5;-4;-6]; A=[4,1,1;3,4,6;2,4,1;1,1,2]; b=[19;30;25;15];
Aeq=[ ]; beq=[ ]; lb=zeros(3,1); ub=[ ]; x0=[ ];
% [ ] indicates not used but the lower bound, lb, must be set to zero
options=optimset('LargeScale','off','Simplex','on');
[x,fval,exitflag,output,lamda]=linprog(f,A,b,Aeq,beq,lb,ub,x0,options)


Typing lamda.ineqlin gives the values 0.5714, 0.9048, 0, 0, which are the 
values in the z row of the final tableau and indicate whether or not the inequalities 
are binding. 

Clearly this is the quick way to get the ‘answer’, but it does not give any 
understanding of the method. The package has the facilities to go through the steps of the 
algorithm one at a time so it can be used to help with the arithmetic while leaving 
the user to determine the steps of the method. 


Example 10.5 


A firm has two plants, P1 and P2, that can produce a particular chemical. The product 
is made from three constituents, A, B and C. In a given period there are 36 000 litres of 
A, 30 000 litres of B and 12 000 litres of C available. Plant P1 requires the constituents 
A, B, C to be mixed in the ratio 4:2:1 respectively, and the manufacturer makes a profit 
of £1.50 per litre of product; plant P2 requires the ratio 3:3:1, and gives a profit of 
£1 per litre of product. 

Determine how production should be allocated to each plant to maximize the profits, 
and how much of A, B and C remain. 

There is a major breakdown in the supply of chemical C, so that only 8000 litres are 
available in the given period. How should production be changed to maximize the 
profits, how much has profit been reduced, and how much of A, B and C remain? 


Solution 


For each 1000 litres produced in plant P1, (4/7) × 1000 will be constituent A, (2/7) × 1000 
will be B and (1/7) × 1000 will be C. For each 1000 litres produced in plant P2, (3/7) × 1000 
will be constituent A, (3/7) × 1000 will be B and (1/7) × 1000 will be C. Thus, taking the three 
constituents in turn and letting x_1 and x_2 represent respectively the amount (in 1000 litre 
units) produced in plants P1 and P2, we obtain 

(4/7)x_1 + (3/7)x_2 ≤ 36          4x_1 + 3x_2 ≤ 252
(2/7)x_1 + (3/7)x_2 ≤ 30    or    2x_1 + 3x_2 ≤ 210
(1/7)x_1 + (1/7)x_2 ≤ 12          x_1 + x_2 ≤ 84

x_1, x_2 ≥ 0

and the profit 

z = 1.5x_1 + x_2

We can immediately construct the initial tableau 

      x_1    x_2   x_3   x_4   x_5   Solution   Ratios
z     -1.5   -1    0     0     0     0
x_3   (4)    3     1     0     0     252        63
x_4   2      3     0     1     0     210        105
x_5   1      1     0     0     1     84         84
The pivot has been found, and hence we introduce x_1 into the basis and construct the 
next tableau following the steps of the simplex algorithm: 

      x_1   x_2   x_3    x_4   x_5   Solution
z     0     1/8   3/8    0     0     94.5
x_1   1     3/4   1/4    0     0     63
x_4   0     3/2   -1/2   1     0     84
x_5   0     1/4   -1/4   0     1     21





The z row is all non-negative, and hence we can immediately read off the solution (multiply 
by 1000 to re-establish proper units) 


x_1 = 63 000, x_2 = 0, z = £94 500 


and only plant P1 is utilized. From the initial tableau we see that since x_3 = 0, there are 
zero litres of A remaining; x_4 = 84, so that we have (84/7) × 1000 = 12 000 litres of B 
remaining; and x_5 = 21, so that (21/7) × 1000 = 3000 litres of C remain. 

After the breakdown, the 12 000 litres of C are reduced to 8000 litres, so that the first 
tableau becomes 

      x_1    x_2   x_3   x_4   x_5   Solution   Ratios
z     -1.5   -1    0     0     0     0
x_3   4      3     1     0     0     252        63
x_4   2      3     0     1     0     210        105
x_5   (1)    1     0     0     1     56         56


We note that we have a different pivot, and hence we expect a different solution. The next 
tableau is derived in the usual way, giving 

      x_1   x_2   x_3   x_4   x_5   Solution
z     0     0.5   0     0     1.5   84
x_3   0     -1    1     0     -4    28
x_4   0     1     0     1     -2    98
x_1   1     1     0     0     1     56





The tableau is again optimal, so 

x_1 = 56 000, x_2 = 0, z = £84 000 


The profit is thus reduced by £10 500 by the breakdown, but still only plant P1 is used. 
The remaining amounts of A, B and C can be checked to be 4000, 14 000 and zero litres 
respectively. 

Since this problem has only two variables, it would be instructive to check these 
results using the graphical method. 
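
The graphical check suggested here amounts to evaluating the profit at each corner of the feasible region. As a rough illustration, the Python sketch below finds the corners by intersecting pairs of boundary lines and keeps the feasible one with the largest z. The function name `best_corner` is hypothetical, and the constraints are used in the scaled form (multiplied through by 7), so the limit on C is 84 before the breakdown and 56 after it.

```python
from fractions import Fraction
from itertools import combinations

def best_corner(c_limit):
    """Return (z, x1, x2) at the best feasible corner of Example 10.5.

    Each constraint is (a1, a2, rhs) meaning a1*x1 + a2*x2 <= rhs;
    the last two rows encode x1 >= 0 and x2 >= 0.
    """
    cons = [(4, 3, 252), (2, 3, 210), (1, 1, c_limit), (-1, 0, 0), (0, -1, 0)]
    best = None
    for (a, b, e), (c, d, f) in combinations(cons, 2):
        det = a * d - b * c
        if det == 0:
            continue                       # parallel lines never cross
        x1 = Fraction(e * d - b * f, det)  # Cramer's rule
        x2 = Fraction(a * f - e * c, det)
        if all(p * x1 + q * x2 <= r for p, q, r in cons):
            z = Fraction(3, 2) * x1 + x2
            if best is None or z > best[0]:
                best = (z, x1, x2)
    return best

print(best_corner(84))   # 12 000 litres of C available
print(best_corner(56))   # 8000 litres of C available
```

The values returned are in the units of the tableau, so they need multiplying by 1000 to give litres and pounds.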
Example 10.6 


Find the maximum of 

z = 4x_1 + 2x_2 + 4x_3

subject to 

3x_1 + x_2 + 2x_3 ≤ 320
x_1 + x_2 + x_3 ≤ 100
2x_1 + x_2 + 2x_3 ≤ 200


Solution 


      x_1   x_2   x_3   x_4   x_5   x_6   Solution   Ratios
z     -4    -2    -4    0     0     0     0
x_4   3     1     2     1     0     0     320        160
x_5   1     1     (1)   0     1     0     100        100
x_6   2     1     2     0     0     1     200        100

Note that in the above tableau there is some arbitrariness in the choices in both Steps 1 
and 2. In Step 1 the column is chosen arbitrarily between the x_1 and x_3 columns. From 
the ratios, the x_5 row is selected from the x_5 and x_6 rows at Step 2, which both have equal 
ratios. Steps 3 to 5 are then followed to give the tableau 

      x_1   x_2   x_3   x_4   x_5   x_6   Solution
z     0     2     0     0     4     0     400
x_4   1     -1    0     1     -2    0     120
x_3   1     1     1     0     1     0     100
x_6   0     -1    0     0     -2    1     0





Although this is the optimal solution with x_1 = x_2 = 0, x_3 = 100 and z = 400, we have 
c_1 = 0 in the z row. Since x_1 is a non-basic variable, there is degeneracy. If we follow 
through the algorithm, choosing the first column at Step 1, we obtain an equally optimal 
solution in the following tableau. Replace x_3 by x_1 in the basic variables, and subtract 
the x_1 row from the x_4 row: 

      x_1   x_2   x_3   x_4   x_5   x_6   Solution
z     0     2     0     0     4     0     400
x_4   0     -2    -1    1     -3    0     20
x_1   1     1     1     0     1     0     100
x_6   0     -1    0     0     -2    1     0





This solution gives x_1 = 100, x_2 = x_3 = 0 and z = 400 once more. It can easily be deduced 
that x_1 = 100(1 - α), x_2 = 0, x_3 = 100α is an optimal solution for any 0 ≤ α ≤ 1 
with z = 400. We could have observed this fact geometrically, since z is just a multiple 
of the left-hand side of the last constraint. 
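
The family of alternative optima can be confirmed numerically. The following Python sketch, with illustrative names only, substitutes x_1 = 100(1 - α), x_2 = 0, x_3 = 100α into the three constraints of Example 10.6 for a sample of values of α and checks that each point is feasible with z = 400. It is a spot check on sampled values, not a proof for every α.

```python
from fractions import Fraction

A = [[3, 1, 2], [1, 1, 1], [2, 1, 2]]   # constraint coefficients
b = [320, 100, 200]                     # right-hand sides
c = [4, 2, 4]                           # objective coefficients

def check(alpha):
    """Return (feasible, z) for x = (100(1 - alpha), 0, 100 alpha)."""
    x = [100 * (1 - alpha), 0, 100 * alpha]
    feasible = all(v >= 0 for v in x) and all(
        sum(a * v for a, v in zip(row, x)) <= bi for row, bi in zip(A, b))
    return feasible, sum(ci * v for ci, v in zip(c, x))

results = [check(Fraction(k, 10)) for k in range(11)]   # alpha = 0, 0.1, ..., 1
print(all(ok and z == 400 for ok, z in results))        # True
```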
10.2.4 Exercises 


Use the graphical method to find the maximum 
value of 

f = 4x + 5y

subject to 

3x + 7y ≤ 10
2x + y ≤ 3
x, y ≥ 0

Sketch the constraints 

2x - y ≤ 6
x + 2y ≤ 8
3x + 2y ≤ 18


ya 


and verify that the maximum of the function 
x + y in the feasible region is at x = 4 and y = 2. 
Check the solution with the simplex method. 
Use a package such as MAPLE or MATLAB to 
verify the solution. 


A manufacturer produces two types of cupboard, 
which are constructed from chipboard and oak 
veneer that both come in standard widths. The first 
type requires 4 m of chipboard and 5 m of oak 
veneer, takes 5 h of labour to produce and gives a 
profit of £24 per unit. The second type requires 5 m 
of chipboard and 2 m of oak veneer, takes 3 h of 
labour to produce and gives a £12 profit per unit. 

On a weekly basis there are 400 m of chipboard 
available, 200 m of oak veneer and a maximum of 
250 h of labour. Write this problem in a linear 
programming form. Use the simplex method to 
determine how many cupboards of each type should 
be made to maximize profits. How much profit is 
made? Which of the scarce resources remain 
unused? Show that the amount of oak veneer 
available can be reduced to 175 m without affecting 
the basis. What is the new solution, and by how 
much is the profit reduced? 


A factory manufactures nails and screws. The profit 
yield is 2p per kg nails and 3p per kg screws. Three 
units of labour are required to manufacture 1 kg nails 
and 6 units to make 1 kg screws. Twenty-four units 
of labour are available. Two units of raw material 
are needed to make 1 kg nails and 1 unit for 1 kg 
screws. Determine the manufacturing policy that 
yields maximum profit from 10 units of raw material. 


A manufacturer makes two types of cylinder, CYL1 
and CYL2. Three materials, M1, M2 and M3, are 
required for the manufacture of each cylinder. 
The following information is provided: 


Quantities of materials required 

        M1   M2   M3
CYL1    1    1    2
CYL2    5    2    1

Quantities of materials available 

        M1   M2   M3
        45   21   24


£4 profit is made on one CYL1 and £3 profit on 
one CYL2. How many of each cylinder should 
the manufacturer make in order to maximize 
profit? 


The Yorkshire Clothing Company makes two 
styles of jacket, the ‘York’ and the ‘Wetherby’. 
The York requires 3 m of cloth and 3 h of 
labour, and makes a profit of £25. The Wetherby 
needs 4 m of cloth and 2 h of labour, and makes 
a profit of £30. The Yorkshire has 400 m of 
cloth available and 300 h of labour available 
each week. Advise the company on the number 
of each style it should produce in order to maximize 
profits. 

The company is prepared to buy more cloth to 
increase its profits, but it will not employ any more 
labour. Under this revised policy, is there a strategy 
that will increase its profits? 


Find the optimal solution of the following LP 
problem: maximize 

z = kx_1 + 20x_2

subject to 

x_1 + 2x_2 ≤ 20
3x_1 + x_2 ≤ 25
x_1, x_2 ≥ 0

where k is a positive parameter representing 
variable profitability. Use both the simplex method 
and the graphical method, and interpret the results 
geometrically. 


Use the simplex method to solve the following 
problem: maximize 

2x_1 + x_2 + 4x_3 + x_4








subject to 
2x, *oxX x3 
x 3x -x,x-4 
4+ 5+0 % 3 








x_1, x_2, x_3, x_4 ≥ 0


A publisher has three books available for printing, 
B1, B2 and B3. The books require varying amounts 
of paper, and the total paper supplies are limited: 

                                          B1     B2     B3     Total units available
Units of paper required per 1000 copies   3      2      1      60
Profit per 1000 copies                    £900   £800   £300
Books B1 and B2 are similar in content, and the 
total combined market for these two books is 
estimated to be at most 15 000 copies. Determine 
how many copies of each book should be printed 
to maximize the overall profit. 


Euroflight is considering the purchase of new 
aircraft. Long-range aircraft cost £4 million each, 
medium-range £2 million each and short-range 
£1 million each, and Euroflight has £60 million 
to invest. The estimated profit from each type 
of aircraft is £0.4 million, £0.3 million and £0.15 
million respectively. The company has trained pilots 
for at most a total of 25 aircraft. Maintenance facilities 
are limited to a maximum of the equivalent of 30 
short-range aircraft. Long-range aircraft need twice 
as much maintenance as short-range ones, and 
medium-range 1.5 times as much. Set this up as a 
linear programming problem, and solve it. Aircraft 
can only be bought in integer numbers, so estimate 
how many of each type should be bought. 


Find x_1, x_2, x_3, x_4 ≥ 0 that maximize 


f = 6x_1 + x_2 + 2x_3 + 4x_4


subject to 
2x, + Xp + = 3 
Xi + җ+ ys 4 


x_1 + x_2 + 3x_3 + 2x_4 ≤ 10


The previous section only dealt with ‘≤’ constraints and did not consider ‘≥’ constraints. 
These prove to be much more troublesome, since there is no obvious initial feasible 
solution, and Phase 1 of the two-phase method is solely concerned with getting such 
a solution. Once this has been obtained, we then move to Phase 2. This is the standard 
simplex method, starting from the solution just obtained from Phase 1. A simple example 
will illustrate the problems involved and the basic ideas of the two-phase method. 


Example 10.7 


Find the maximum of 

z = x + y

subject to 

-x + 2y ≤ 6
x ≤ 4
2x + y ≥ 4
x, y ≥ 0
Figure 10.5 The feasible region. 


Solution 











The region defined by the constraints is shown in Figure 10.5. It is clear from the figure 
that the origin is not in the feasible region and that x = 4, y = 5 gives the optimal solution. 
We have already appreciated that the graphical method is only useful for two-dimensional 
problems, so we must explore how the simplex method copes with this problem. 

Add in the non-negative variables r, s and t to give 


-x + 2y + r = 6
x + s = 4
2x + y - t = 4


The r and s are the usual slack variables. Because we must subtract t to take away the 
surplus, t is called a surplus variable. The obvious solution x = y = 0, r = 6, s = 4, t = -4 
does not satisfy the condition that all variables be non-negative. The algebra is saying that 
the origin is not in the feasible region. Because the simplex method works so well, the 
last equation is forced into standard form by adding in yet another variable, u, called an 
artificial variable, to give 


2x+y-t+u=4 


Now we have a feasible solution x = y = t = 0, r = 6, s = 4, u = 4, but not to the problem 
we originally stated. As the term ‘artificial variable’ implies, we wish to get rid of u and 
then reduce the problem back to our original one at a feasible corner. The variable u 
can be eliminated by forcing it to zero and this can be done by entering Phase 1 with a 
new cost function 


z' = -u


We see that if we can maximize z' then this is at u = 0, and our Phase 1 will be complete. 
The simplex tableau for Phase 1 then takes the form 





      x     y     r    s    t     u    Solution
z     -1    -1    0    0    0     0    0
z'    0     0     0    0    0     1    0
r     -1    2     1    0    0     0    6
s     1     0     0    1    0     0    4
u     2     1     0    0    -1    1    4
where z has been included for the elimination but does not enter the optimization: only 
the z' row is considered in Phase 1. It may be observed that the tableau is not of standard 
form, since u is a basic variable and the (z', u) entry is non-zero. This must be remedied 
by subtracting the u row from the z' row to give the standard-form tableau 

      x     y     r    s    t     u    Solution
z     -1    -1    0    0    0     0    0
z'    -2    -1    0    0    1     0    -4
r     -1    2     1    0    0     0    6
s     1     0     0    1    0     0    4
u     (2)   1     0    0    -1    1    4








Manipulation using the usual simplex algorithm gives the tableau 

      x    y      r    s    t      u      Solution
z     0    -1/2   0    0    -1/2   1/2    2
z'    0    0      0    0    0      1      0
r     0    5/2    1    0    -1/2   1/2    8
s     0    -1/2   0    1    1/2    -1/2   2
x     1    1/2    0    0    -1/2   1/2    2

At this stage z’ = 0 and u = 0, so that we have driven u out of the problem, and the z’ 
row and u column can now be deleted. The Phase 1 solution gives x = 2, y = 0, which can 
be observed to be a corner of the feasible region in Figure 10.5. 

We now enter Phase 2, with the z' row and u column deleted, and perform the usual 
sequence of steps. The initial tableau is 

      x    y       r    s    t      Solution
z     0    -1/2    0    0    -1/2   2
r     0    (5/2)   1    0    -1/2   8
s     0    -1/2    0    1    1/2    2
x     1    1/2     0    0    -1/2   2

and the steps of the algorithm then give in turn 

      x    y    r      s    t       Solution
z     0    0    1/5    0    -3/5    18/5
y     0    1    2/5    0    -1/5    16/5
s     0    0    1/5    1    (2/5)   18/5
x     1    0    -1/5   0    -2/5    2/5

      x    y    r     s     t    Solution
z     0    0    1/2   3/2   0    9
y     0    1    1/2   1/2   0    5
t     0    0    1/2   5/2   1    9
x     1    0    0     1     0    4

We now have an optimum solution, since all the z row entries are non-negative with x = 4, 
y = 5 and objective function z = 9 in agreement with the graphical solution. 


The general two-phase strategy is then as follows: 


Phase 1 


(a) Introduce slack and surplus variables. 
(b) Introduce artificial variables alongside the surplus variables, say x_p, ..., x_q. 
(c) Write the artificial cost function 

z' = -x_p - x_{p+1} - ... - x_q

(d) Subtract rows x_p, x_{p+1}, ..., x_q from the cost function z' to ensure there are zeros 
in the entries in the z' row corresponding to the basic variables. 

(e) Use the standard simplex method to maximize z' (keeping the z row as an extra 
row) until z' = 0 and x_p = x_{p+1} = ... = x_q = 0. 


Phase 2 

(a) Eliminate the z' row and artificial columns x_p, ..., x_q. 

(b) Use the standard simplex method to maximize the objective function z. 


There are other approaches to obtaining an initial feasible basic solution, but Phase 1 of the 
two-phase method gives an efficient way of obtaining a starting point. Geometrically, 
it uses the simplex method to search the non-feasible vertices until it is driven to a 
vertex in the feasible region. 
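
The two-phase strategy can be sketched in a few dozen lines of Python. The code below is an illustration under several assumptions, not a definitive implementation: the names `pivot`, `run_simplex` and `two_phase_max` are hypothetical, all right-hand sides are taken to be non-negative, ties are broken by taking the first candidate, and the case of an artificial variable left in the basis at zero level after Phase 1 is not handled. Instead of physically deleting the z' row and the artificial columns for Phase 2, the sketch simply never lets an artificial column enter the basis.

```python
from fractions import Fraction

def pivot(T, row, col):
    """Steps 4 and 5: make the pivot 1, then clear the rest of its column."""
    p = T[row][col]
    T[row] = [v / p for v in T[row]]
    for i in range(len(T)):
        if i != row and T[i][col] != 0:
            f = T[i][col]
            T[i] = [a - f * r for a, r in zip(T[i], T[row])]

def run_simplex(T, basis, obj, allowed):
    """Steps 1-5 with row `obj` as objective; constraint rows start at 2."""
    while True:
        col = min(allowed, key=lambda k: T[obj][k])            # Step 1
        if T[obj][col] >= 0:
            return
        ratios = [(T[i][-1] / T[i][col], i)                    # Step 2
                  for i in range(2, len(T)) if T[i][col] > 0]
        if not ratios:
            raise ValueError("unbounded problem")
        row = min(ratios)[1]
        basis[row - 2] = col                                   # Step 3
        pivot(T, row, col)

def two_phase_max(c, constraints):
    """Maximize c.x with x >= 0; constraints are (coeffs, sense, rhs >= 0)."""
    n = len(c)
    extra = []                       # (row, sign, is_artificial) per new column
    for i, (_, sense, _) in enumerate(constraints):
        if sense == "<=":
            extra.append((i, 1, False))                        # slack
        elif sense == ">=":
            extra += [(i, -1, False), (i, 1, True)]            # surplus, artificial
        else:
            extra.append((i, 1, True))                         # equality
    width = n + len(extra)
    z_row = [Fraction(-v) for v in c] + [Fraction(0)] * (len(extra) + 1)
    zp_row = ([Fraction(0)] * n + [Fraction(int(art)) for _, _, art in extra]
              + [Fraction(0)])
    T, basis = [z_row, zp_row], [None] * len(constraints)
    for i, (coeffs, _, rhs) in enumerate(constraints):
        cells = [Fraction(v) for v in coeffs]
        cells += [Fraction(s if r == i else 0) for r, s, _ in extra]
        T.append(cells + [Fraction(rhs)])
    for k, (r, s, _) in enumerate(extra):
        if s == 1:
            basis[r] = n + k         # slack or artificial starts in the basis
    artificial = {n + k for k, e in enumerate(extra) if e[2]}
    for r, var in enumerate(basis):  # Phase 1(d): zero z' under basic columns
        if var in artificial:
            T[1] = [a - b for a, b in zip(T[1], T[r + 2])]
    run_simplex(T, basis, 1, range(width))                     # Phase 1(e)
    if T[1][-1] != 0:
        raise ValueError("no feasible solution")
    run_simplex(T, basis, 0, [k for k in range(width) if k not in artificial])
    x = [Fraction(0)] * width
    for r, var in enumerate(basis):
        x[var] = T[r + 2][-1]
    return T[0][-1], x[:n]

z, x = two_phase_max([1, 1], [([-1, 2], "<=", 6), ([1, 0], "<=", 4),
                              ([2, 1], ">=", 4)])
print(z, x)
```

The call shown uses the data of Example 10.7 and should return z = 9 at x = 4, y = 5; an equality constraint would be passed with the sense "=".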


Example 10.8 


Use the two-phase method to solve the following LP problem: maximize 

z = 4x_1 + (1/2)x_2 + x_3

subject to 

x_1 + 2x_2 + 3x_3 ≥ 2
2x_1 + x_2 + x_3 ≤ 5
x_1, x_2, x_3 ≥ 0
Solution 

Phase 1  Introduce a surplus variable x_4 and a corresponding artificial variable x_5 into the first 
inequality. A slack variable x_6 is required for the second inequality. The artificial cost is just 

z' = -x_5

and we can construct the initial tableau 

      x_1   x_2    x_3   x_4   x_5   x_6   Solution
z     -4    -1/2   -1    0     0     0     0
z'    0     0      0     0     1     0     0
x_5   1     2      3     -1    1     0     2
x_6   2     1      1     0     0     1     5

We subtract the x_5 row from the z' row to eliminate the 1 from the (z', x_5) element, 
giving the tableau 

      x_1   x_2    x_3   x_4   x_5   x_6   Solution
z     -4    -1/2   -1    0     0     0     0
z'    -1    -2     -3    1     0     0     -2
x_5   1     2      (3)   -1    1     0     2
x_6   2     1      1     0     0     1     5

We now apply the steps of the simplex algorithm to give the tableau 

      x_1     x_2   x_3   x_4    x_5    x_6   Solution
z     -11/3   1/6   0     -1/3   1/3    0     2/3
z'    0       0     0     0      1      0     0
x_3   1/3     2/3   1     -1/3   1/3    0     2/3
x_6   5/3     1/3   0     1/3    -1/3   1     13/3

Since z' = 0 and the artificial variable x_5 has been driven into the non-basic variables, 
Phase 1 ends. 
Phase 2  The z' row and the x_5 column are now deleted, and the following sequence of tableaux 
constructed following the rules of the simplex algorithm: 

      x_1     x_2   x_3   x_4    x_6   Solution
z     -11/3   1/6   0     -1/3   0     2/3
x_3   (1/3)   2/3   1     -1/3   0     2/3
x_6   5/3     1/3   0     1/3    1     13/3

      x_1   x_2    x_3   x_4   x_6   Solution
z     0     15/2   11    -4    0     8
x_1   1     2      3     -1    0     2
x_6   0     -3     -5    (2)   1     1

      x_1   x_2    x_3    x_4   x_6   Solution
z     0     3/2    1      0     2     10
x_1   1     1/2    1/2    0     1/2   5/2
x_4   0     -3/2   -5/2   1     1/2   1/2

The solution is now optimal, with x_1 = 5/2, x_2 = x_3 = 0 and z = 10. Note that the first 
inequality is not binding, since x_4 ≠ 0, while the second inequality is binding. 


As indicated in the previous section a computer package such as MAPLE or MATLAB 
can deal easily with LP problems. For ‘>=’ inequalities the packages are equally efficient 
and will produce the answer, but there is no indication that the two-phase method has 
been used. For Example 10.8 the MAPLE code 


with(simplex):
constr:={x1+2*x2+3*x3>=2,2*x1+x2+x3<=5};
obj:=4*x1+x2/2+x3;
maximize(obj,constr,NONNEGATIVE);











produces the result x_1 = 5/2, x_2 = 0 and x_3 = 0 instantly. 
The corresponding MATLAB code is 

% the '>=' constraint is multiplied by -1 to fit the form Ax <= b
f=[-4;-0.5;-1]; A=[-1,-2,-3;2,1,1]; b=[-2;5];
Aeq=[ ]; beq=[ ]; lb=zeros(3,1); ub=[ ]; x0=[ ];
options=optimset('LargeScale','off','Simplex','on');
[x,fval,exitflag,output,lamda]=linprog(f,A,b,Aeq,beq,lb,ub,x0,options)


Three ores, A, B and C, are blended to form 100 kg of alloy; the percentage contents and 
the costs are as follows: 





Ore               A      B      C
Iron              70     60     0
Lead              20     10     40
Copper            10     30     60
Cost (£ per kg)   3000   2000   1000





The alloy must contain at least 20% iron, at least 25% lead but less than 48% copper. 
Find the blend of ores that minimizes the cost of the alloy. 


Solution 
Let x_1, x_2 and x_3 be the weights (kg) of ores A, B and C respectively in the 100 kg of alloy. 
The constraints give 

iron     0.7x_1 + 0.6x_2 ≥ 20
lead     0.2x_1 + 0.1x_2 + 0.4x_3 ≥ 25
copper   0.1x_1 + 0.3x_2 + 0.6x_3 ≤ 48

and to make the 100 kg of alloy, 

x_1 + x_2 + x_3 = 100

The cost is 

3000x_1 + 2000x_2 + 1000x_3

which is to be minimized. 
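
Before setting up the tableaux, a crude numerical check of the model can be useful. The Python sketch below is not the simplex method: it simply tries every blend with whole-kilogram amounts of ores A and B (ore C making up the 100 kg) and keeps the cheapest feasible one. Restricting to whole kilograms is an assumption made purely for illustration, and the constraints are multiplied by 10 so that only integer arithmetic is needed.

```python
best = None
for x1 in range(101):
    for x2 in range(101 - x1):
        x3 = 100 - x1 - x2                      # the blend must total 100 kg
        iron = 7 * x1 + 6 * x2                  # 10 x (0.7 x1 + 0.6 x2)
        lead = 2 * x1 + x2 + 4 * x3             # 10 x (0.2 x1 + 0.1 x2 + 0.4 x3)
        copper = x1 + 3 * x2 + 6 * x3           # 10 x (0.1 x1 + 0.3 x2 + 0.6 x3)
        if iron >= 200 and lead >= 250 and copper <= 480:
            cost = 3000 * x1 + 2000 * x2 + 1000 * x3
            if best is None or cost < best[0]:
                best = (cost, x1, x2, x3)
print(best)   # (140000, 0, 40, 60)
```

The result can be compared with the value obtained at the end of the two-phase calculation for this example.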
To reduce the problem to standard form, we change the problem to a maximization of 


z = -3000x_1 - 2000x_2 - 1000x_3


For the inequality constraints we use surplus variables x_4 and x_5 and a slack variable x_6. 
We require two artificial variables x_7 and x_8 alongside the surplus variables. Thus the 
inequalities become 


0.7x_1 + 0.6x_2 - x_4 + x_7 = 20
0.2x_1 + 0.1x_2 + 0.4x_3 - x_5 + x_8 = 25
0.1x_1 + 0.3x_2 + 0.6x_3 + x_6 = 48


To deal with the equality constraint, we introduce a further artificial variable x_9: 

x_1 + x_2 + x_3 + x_9 = 100


We must drive x_9 to zero, to ensure that the equality holds, so it is essential to put x_9 
into the artificial cost function. (Note that this is the standard way of dealing with an 
equality constraint.) We first enter Phase 1. Steps (a) to (c) of Phase 1 of the two-phase 
method give the initial tableau 











        x1     x2     x3    x4    x5    x6    x7    x8    x9    Solution
z     3000   2000   1000     0     0     0     0     0     0          0
z'       0      0      0     0     0     0     1     1     1          0
x7     0.7    0.6      0    -1     0     0     1     0     0         20
x8     0.2    0.1    0.4     0    -1     0     0     1     0         25
x6     0.1    0.3    0.6     0     0     1     0     0     0         48
x9       1      1      1     0     0     0     0     0     1        100





It is necessary to remove the 1s from the z' row in the basic variable columns $x_7$, $x_8$ and
$x_9$. Following (d) of the general strategy, we replace the z' row by (z' row) - ($x_7$ row) -
($x_8$ row) - ($x_9$ row) to give the tableau

        x1     x2     x3    x4    x5    x6    x7    x8    x9    Solution
z     3000   2000   1000     0     0     0     0     0     0          0
z'    -1.9   -1.7   -1.4     1     1     0     0     0     0       -145
x7     0.7    0.6      0    -1     0     0     1     0     0         20
x8     0.2    0.1    0.4     0    -1     0     0     1     0         25
x6     0.1    0.3    0.6     0     0     1     0     0     0         48
x9       1      1      1     0     0     0     0     0     1        100








Several tableaux need to be completed to drive z' to zero and complete Phase 1, with
the final tableau being

       x1      x2    x3    x4       x5    x6    x7       x8      x9    Solution
z       0   -2000     0     0   -10000     0     0    10000   -5000    -250 000
z'      0       0     0     0        0     0     1        1       1           0
x1      1     1.5     0     0        5     0     0       -5       2          75
x3      0    -0.5     1     0       -5     0     0        5      -1          25
x6      0    0.45     0     0      2.5     1     0     -2.5     0.4        25.5
x4      0    0.45     0     1      3.5     0    -1     -3.5     1.4        32.5





Removing the artificial variables and the z' row gives the tableau 











       x1      x2    x3    x4       x5    x6    Solution
z       0   -2000     0     0   -10000     0    -250 000
x1      1     1.5     0     0        5     0          75
x3      0    -0.5     1     0       -5     0          25
x6      0    0.45     0     0      2.5     1        25.5
x4      0    0.45     0     1      3.5     0        32.5





The algorithm is now ready for Phase 2, since a sensible feasible basic solution is 
available. The standard procedure leads, after many cycles, to the final tableau 





        Solution
z       -140 000
x4             4
x3            60
x2            40
x5             3





and the solution can be read off as 


\[ x_1 = 0, \quad x_2 = 40, \quad x_3 = 60 \]

with the cost minimized at £140 000. (Note that the cost in the tableau is negative, since
the original problem is a minimization problem.) It may be noted that $x_6 = 0$, so the
copper constraint is binding, while the iron constraint gives 4% more than the required
minimum and the lead constraint gives 3% more than the required minimum.
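The result of Example 10.9 can be cross-checked without any tableau work. The sketch below is not the two-phase method: it simply searches every blend made up of whole kilograms (an assumption introduced here only to keep the search finite) and keeps the cheapest one that satisfies the three composition constraints. The function names are illustrative, and the constraints are multiplied by 10 so that the arithmetic stays in integers.

```python
def blend_cost(x1, x2, x3):
    """Cost in pounds of x1, x2, x3 kg of ores A, B and C."""
    return 3000 * x1 + 2000 * x2 + 1000 * x3


def is_feasible(x1, x2, x3):
    """Composition constraints of Example 10.9, each multiplied by 10."""
    iron = 7 * x1 + 6 * x2 >= 200
    lead = 2 * x1 + x2 + 4 * x3 >= 250
    copper = x1 + 3 * x2 + 6 * x3 <= 480
    return iron and lead and copper


def cheapest_whole_kg_blend(total=100):
    """Exhaustive search over whole-kilogram blends with x1 + x2 + x3 = total."""
    best = None
    for x1 in range(total + 1):
        for x2 in range(total + 1 - x1):
            x3 = total - x1 - x2
            if is_feasible(x1, x2, x3):
                candidate = (blend_cost(x1, x2, x3), x1, x2, x3)
                if best is None or candidate < best:
                    best = candidate
    return best


best = cheapest_whole_kg_blend()
print(best)  # (140000, 0, 40, 60)
```

The search agrees with the tableau solution only because the optimum happens to fall on whole numbers; for a general LP problem a proper solver is still needed.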


Except for simple, illustrative examples, the amount of computational work in the two-phase
strategy is heavy and requires the use of a computer package. Even for the comparatively
simple Example 10.9, many tableaux were required in the solution. The computer
packages MAPLE and MATLAB have no difficulty in dealing with the equality constraint
that appears in the problem.


10.2.6 Exercises 


12 Use the graphical approach to solve the LP problem
max(x + 2y) 
subject to the constraints 
1<yx<4 
x + y ≤ 5
Check your solution using the two-phase method 
and by using a MAPLE or MATLAB implementation. 
13 Use the simplex method to find positive values of $x_1$
and $x_2$ that minimize
$f = 10x_1 + x_2$
subject to 
4x, +x, 32 
2x, +x, = 12 
2X,-X,= 4 
$-2x_1 + x_2 \leq 8$


Sketch the points obtained by the simplex method 
on a graph, indicating how the points progress 
through Phases 1 and 2 to the solution. 


14 The Footsie company produces boots and shoes.
If no boots are made, the company can produce a
maximum of 250 pairs of shoes in a day. Each pair
of boots takes twice as long to make as each pair of
shoes. The maximum daily sales of boots and shoes
are 200, but 25 pairs of boots must be produced to
satisfy an important customer. The profits per pair
of boots and shoes are £8 and £5 respectively.
Determine the daily production plan to maximize
profits. Use the two-phase method to obtain the
solution, and verify your result with a graphical
solution.


15 In Exercise 9 there is an additional union agreement
that at least 50 000 books must be printed. Does the 
solution change? If so, calculate the new optimum 
strategy. 


16 Solve the LP problem
max(x + y + z)
subject to the constraints
x ≥ 1
х+2у = 3 
у+32=& 4 


by using the two-phase method. Check your result 
using MAPLE or MATLAB. 


17 Solve the following LP problem: minimize
2х1 + 75+ 4х; + 5х4 
subject to 
xj,—-X4—x,20 
Xp +X, = 2 
Xis Xz Xy X4 77 0 


18 A trucking company requires antifreeze that
contains at least 50% of pure glycol and at least
5% of anticorrosive additive. The company can
buy three products, A, B and C, whose constituents
and costs are as follows:

                    A       B       C
% glycol           65      25      80
% additive         10       3       0
Cost (£/litre)    1.8     0.9     1.5

What blend will provide the required antifreeze
solution at minimum cost? What is the cost of
100 litres of solution?


19 A builder is constructing three different styles of
house on an estate, and is deciding which styles
to erect in the next phase of building. There are
40 plots of equal size, and the different styles
require 1, 2 and 2 plots respectively. The builder
anticipates shortages of two materials, and estimates
the requirements and supplies (in appropriate
units) to be as follows:

                             Requirements
                     Style 1   Style 2   Style 3   Total supply
Facing stone            1         2         5          58
Weather boarding        3         2         1          72

The local authority insists that there be at least
5 more houses of style 2 than style 1. If the profits
on the houses are £1000, £1500 and £2500 for
styles 1, 2 and 3 respectively, find how many of
each style the builder should construct to maximize
the total profit.

20 A manufacturer produces three types of carpeting,
C1, C2 and C3. Two of the raw materials, M1
and M2, are in short supply. The following table
gives the supplies of M1 and M2 available (in
1000s of kg), the quantities of M1 and M2 required
for each 1000 m² of carpet, and the profits made
from each type of carpet (in £1000s per 1000 m²):

                        Quantities required
                          M1      M2      Profit
C1                         1       1         2
C2                         1       1         3
C3                         1       0         0
Quantities available       5       4

Carpet of type C3 is non-profit-making, but is
included in the range in order to enable the
company to satisfy its customers. The company
has policies that require that, if $x_1$, $x_2$ and $x_3$ 1000s
of m² of C1, C2 and C3 respectively are made then

\[ x_1 \geq 1 \]

and

\[ x_1 - x_2 + x_3 \geq 2 \]

How much carpet of each type should the company
manufacture in order to satisfy the constraints and
maximize profits?


10.3 Lagrange multipliers


10.3.1 Equality constraints


In Section 10.2 we looked at the situation where all the functions were linear. As soon
as functions become nonlinear, the problems become very much more difficult. This
is generally the case in most of mathematics, and is certainly true in optimization
problems.


Figure 10.6 Lagrange multiplier problem, showing contours of f(x, y) = constant.


In Section 9.7.4 of Modern Engineering Mathematics it was shown how to use Lagrange
multipliers to solve the problem of the optimization of a nonlinear function of many
variables subject to equality constraints. For the general problem it was shown that the
necessary conditions for the extremum of

\[ f(x_1, x_2, \ldots, x_n) \]

subject to

\[ g_i(x_1, x_2, \ldots, x_n) = 0 \quad (i = 1, 2, \ldots, m \ (m < n)) \]

are

\[ \frac{\partial f}{\partial x_k} + \lambda_1 \frac{\partial g_1}{\partial x_k} + \lambda_2 \frac{\partial g_2}{\partial x_k} + \ldots + \lambda_m \frac{\partial g_m}{\partial x_k} = 0 \quad (k = 1, 2, \ldots, n) \qquad (10.7) \]

where $\lambda_1, \lambda_2, \ldots, \lambda_m$ are the Lagrange multipliers. These n equations must be solved
together with the m constraints $g_i = 0$. Thus there are n + m equations in the n + m
unknowns $x_1, x_2, \ldots, x_n$ and $\lambda_1, \lambda_2, \ldots, \lambda_m$.
For a two-variable problem of finding the extremum of

f(x, y) subject to the single constraint g(x, y) = 0

the problem is illustrated geometrically in Figure 10.6. We are looking for the
maximum, say, of the function f(x, y), but only considering those points in the plane
that lie on the curve g(x, y) = 0. The mathematical conditions look much simpler of
course, as

\[ f_x + \lambda g_x = 0, \qquad f_y + \lambda g_y = 0, \qquad g = 0 \qquad (10.8) \]

There are two comments that should be noted. First, the method fails if $g_x = g_y = 0$
at the solution point. Such points are called singular points: fortunately they are rare,
but their existence should be noted. Secondly, sufficient conditions for a maximum, a
minimum or a saddle point can be derived, but they are difficult to apply and not very
useful. Example 10.12 will illustrate an intuitive approach to sufficiency.

A few examples should be enough to remind the reader of the problems involved and
to show the techniques required to solve the equations.


Example 10.10 Fermat stated in 1661 that 'Light travels along the shortest path'. Find the path that
joins the eye to an object when they are in separate media (Figure 10.7).


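Figure 10.7 Fermat's shortest-path problem (medium 1, velocity of light v; medium 2, velocity of light V).


Solution The velocities of light in the two media are v and V. The time of transit of light is given
by the geometry as

\[ T = \frac{a}{v\cos\alpha} + \frac{b}{V\cos\beta} \]

and is then subject to the geometrical constraint that

\[ L = a\tan\alpha + b\tan\beta \]

Applying (10.8), with $g = a\tan\alpha + b\tan\beta - L$,

\[ 0 = \frac{\partial T}{\partial\alpha} + \lambda\frac{\partial g}{\partial\alpha} = \frac{a}{v}\sec\alpha\tan\alpha + \lambda a\sec^2\alpha \]

\[ 0 = \frac{\partial T}{\partial\beta} + \lambda\frac{\partial g}{\partial\beta} = \frac{b}{V}\sec\beta\tan\beta + \lambda b\sec^2\beta \]

These give as the only solution

\[ \sin\alpha = -\lambda v, \qquad \sin\beta = -\lambda V \]

or

\[ \frac{\sin\alpha}{\sin\beta} = \frac{v}{V} \]

which is known as Snell's law.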
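Snell's law can also be confirmed numerically, without Lagrange multipliers, by minimizing the transit time directly. The sketch below is an illustration under assumed values of a, b, L, v and V (none of which come from the text). It eliminates the constraint by writing the horizontal crossing distance as x = a tan α, so that L - x = b tan β, and then locates the minimum of the transit time by a simple ternary search.

```python
import math


def transit_time(x, a, b, L, v, V):
    """Transit time when the ray crosses the interface at horizontal offset x = a*tan(alpha)."""
    return math.hypot(a, x) / v + math.hypot(b, L - x) / V


def fastest_crossing(a, b, L, v, V, iterations=200):
    """Ternary search for the minimum of the (convex) transit time on 0 <= x <= L."""
    lo, hi = 0.0, L
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if transit_time(m1, a, b, L, v, V) < transit_time(m2, a, b, L, v, V):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


a, b, L, v, V = 1.0, 2.0, 3.0, 1.0, 0.75   # illustrative values only
x = fastest_crossing(a, b, L, v, V)
sin_alpha = x / math.hypot(a, x)
sin_beta = (L - x) / math.hypot(b, L - x)
ratio = sin_alpha / sin_beta
print(ratio, v / V)   # the two numbers should agree closely
```

Changing the assumed velocities or distances should leave the agreement between the two printed numbers intact, which is the content of Snell's law.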


In Example 10.10, to obtain Snell's law, the solution of the equations was quite
straightforward, but it is rarely so easy. Frequently it is technically the most difficult task, and
it is easy to miss solutions. We return to the 'milk carton' problem discussed earlier to
illustrate the point.


Example 10.11 Find the minimum area of the milk carton problem stated in Example 10.2 and illustrated
in Figure 10.1.


Solution Taking measurements in millimetres, the basic mathematical problem is to minimize

\[ A = (2b + 2w + 5)(h + b + 10) \]

subject to

\[ hbw = 1\,136\,000 \]


Applying the Lagrange multiplier equations (10.7), we obtain

\[ 0 = (2b + 2w + 5) + \lambda bw \]
\[ 0 = (2b + 2w + 5) + 2(h + b + 10) + \lambda hw \]
\[ 0 = 2(h + b + 10) + \lambda hb \]

giving, with the constraint, four equations in the four unknowns h, b, w and λ. If we eliminate λ and w from
these equations, we are left with the same equations that we derived in Example 10.2,
which have no simple analytical solutions.


The only way to proceed further with Example 10.11 is by a numerical solution. Thus,
even with simple problems such as this one, we need a numerical algorithm, and in most
realistic problems in science and engineering we encounter similar severe computational
difficulties. It is often a problem even to write down the function or the constraints
explicitly. Such functions frequently emerge as numbers from a complicated computer
program. It is therefore essential to look for efficient numerical algorithms to optimize
such functions. This will be the substance of Section 10.4.

A final example shows the situation where there are more than two variables involved.
An indication will be given of the difficulties of proving that the point obtained is a
maximum.


Figure 10.8 Hopper.


Example 10.12 A hopper is to be made from a cylindrical portion connected to a conical portion as
indicated in Figure 10.8. It is required to find the maximum volume subject to a given
surface area.


Solution We can compute the volume of the hopper as

\[ V = \pi R^2 L + \tfrac{1}{3}\pi R^3 \tan\alpha \]

and find its maximum subject to the surface area being given as

\[ A = 2\pi R L + \pi R^2 \sec\alpha \]


Applying (10.7) with the appropriate variables, we obtain

\[ 0 = \frac{\partial V}{\partial R} + \lambda\frac{\partial g}{\partial R} = 2\pi R L + \pi R^2\tan\alpha + \lambda(2\pi L + 2\pi R\sec\alpha) \]

\[ 0 = \frac{\partial V}{\partial L} + \lambda\frac{\partial g}{\partial L} = \pi R^2 + \lambda 2\pi R \]

\[ 0 = \frac{\partial V}{\partial\alpha} + \lambda\frac{\partial g}{\partial\alpha} = \tfrac{1}{3}\pi R^3\sec^2\alpha + \lambda\pi R^2\sec\alpha\tan\alpha \]

First, λ can be easily evaluated as $\lambda = -\tfrac{1}{2}R$. The last of the above three equations
becomes

\[ \pi R^2\sec^2\alpha\,(\tfrac{1}{3}R + \lambda\sin\alpha) = 0 \]

and hence $\sin\alpha = \tfrac{2}{3}$. Since $0 < \alpha < \tfrac{1}{2}\pi$, we have α = 0.730 rad or 41.8°. A little further
algebraic manipulation between the constraints and the above equations gives $R^2 = A/(\pi\sqrt{5})$,
so that $R = 0.377A^{1/2}$ and $V = 0.126A^{3/2}$.

Normally it would be assumed that this is a maximum for the volume on intuitive
or geometrical grounds. To prove this rigorously, we take small variations around the
suspected maximum and show that the volume is larger than all its neighbours. For
simplicity let $R^2 = (A/\pi\sqrt{5})(1 + \delta)$, $\sec\alpha = (3/\sqrt{5})(1 + \varepsilon)$, evaluate L from the constraint
g = 0 and then calculate V to second order in ε and δ by Taylor's theorem. Some careful
algebra gives

\[ V = \frac{A^{3/2}}{3\,(5^{1/4})\,\pi^{1/2}}\left(1 - \tfrac{3}{8}\delta^2 - \tfrac{9}{16}\varepsilon^2\right) \]

This shows that for any non-zero values of ε and δ we obtain a smaller volume, and
hence we have proved that we have found a maximum.
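The same conclusion can be checked numerically. The sketch below is only an illustration: it fixes A = 1 (an arbitrary choice), eliminates L through the surface-area constraint, evaluates the volume at the stationary point R² = A/(π√5), sec α = 3/√5, and then compares it with the volume at a handful of nearby points. The helper names and the 5% perturbation size are assumptions made for the example.

```python
import math


def hopper_volume(R, sec_alpha, A):
    """Hopper volume with L eliminated via A = 2*pi*R*L + pi*R**2*sec(alpha)."""
    tan_alpha = math.sqrt(sec_alpha ** 2 - 1.0)
    L = (A - math.pi * R ** 2 * sec_alpha) / (2.0 * math.pi * R)
    return math.pi * R ** 2 * L + math.pi * R ** 3 * tan_alpha / 3.0


A = 1.0                                            # illustrative surface area
R0 = math.sqrt(A / (math.pi * math.sqrt(5.0)))     # stationary radius
sec0 = 3.0 / math.sqrt(5.0)                        # sec(alpha) when sin(alpha) = 2/3
V0 = hopper_volume(R0, sec0, A)


def perturbed_volume(delta, eps):
    """Volume at R**2 = R0**2 * (1 + delta) and sec(alpha) = sec0 * (1 + eps)."""
    return hopper_volume(R0 * math.sqrt(1.0 + delta), sec0 * (1.0 + eps), A)


trials = [(d, e) for d in (-0.05, 0.0, 0.05) for e in (-0.05, 0.0, 0.05)
          if (d, e) != (0.0, 0.0)]
all_smaller = all(perturbed_volume(d, e) < V0 for d, e in trials)
print(round(V0, 4), all_smaller)
```

A finite set of trial points is of course not a proof, but it is a quick safeguard against algebraic slips in the analytical argument.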


It should be reiterated that the major problem lies in solving the Lagrange multiplier 
equations and not in writing them down. This is typical, and supports the need for good 
numerical algorithms to solve such problems; they only have to be marginally more 
difficult than Example 10.12 to become impossible to manipulate analytically. 


10.3.2 Inequality constraints


Although we do not intend to consider them in any detail here, for reference we shall
state the basic extension of (10.7) to the case of inequality constraints. Kuhn and Tucker
proved the following result in the 1940s.

To maximize the function

\[ f(x_1, \ldots, x_n) \]

subject to

\[ g_i(x_1, \ldots, x_n) \leq 0 \quad (i = 1, \ldots, m) \]

the equivalent conditions to (10.7) are

\[ \frac{\partial f}{\partial x_k} - \lambda_1\frac{\partial g_1}{\partial x_k} - \ldots - \lambda_m\frac{\partial g_m}{\partial x_k} = 0 \quad (k = 1, \ldots, n) \]

\[ \lambda_i g_i = 0, \qquad \lambda_i \geq 0, \qquad g_i \leq 0 \quad (i = 1, \ldots, m) \]

The equation $\lambda_i g_i = 0$ gives two alternative conclusions for each constraint, either

(a) $g_i = 0$, in which case the constraint is 'active' and the corresponding $\lambda_i > 0$, or

(b) $\lambda_i = 0$ and $g_i < 0$, so that the optimum is away from this constraint and the
Lagrange multiplier is not necessary.

Implementation of the Kuhn-Tucker result is not very easy, even though in principle
it looks straightforward. There are so many cases to check that it becomes very
susceptible to error.
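A one-variable illustration shows how the two alternatives are checked in practice. The problem used here is invented for the purpose: maximize f(x) = -(x - 2)² subject to g(x) = x - c ≤ 0, with the stationarity condition taken in the form f'(x) - λg'(x) = 0 and λ ≥ 0 (the usual convention for a maximization with g ≤ 0). The sketch tries the inactive case first and falls back to the active case only if the unconstrained stationary point is infeasible.

```python
def kuhn_tucker_1d(c):
    """Maximize f(x) = -(x - 2)**2 subject to g(x) = x - c <= 0 by checking both cases."""
    # Case (b): multiplier is zero, so f'(x) = -2*(x - 2) = 0 gives x = 2.
    x, lam = 2.0, 0.0
    if x - c <= 0:
        return x, lam, "inactive"
    # Case (a): constraint active, so x = c and f'(x) - lam*g'(x) = 0 gives lam.
    x = float(c)
    lam = -2.0 * (x - 2.0)
    if lam < 0.0:
        raise ValueError("no Kuhn-Tucker point found")
    return x, lam, "active"


print(kuhn_tucker_1d(1))   # (1.0, 2.0, 'active')
print(kuhn_tucker_1d(3))   # (2.0, 0.0, 'inactive')
```

With m constraints there are 2^m such combinations of active and inactive constraints to examine, which is why the bookkeeping becomes error-prone so quickly.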


10.3.3 Exercises


21 Find the optimum of $f = x^2 + xy + y^2$ subject to the
constraint x + y = 1 using Lagrange multipliers.
Show that the optimum is a minimum.

22 Find the shortest distance from the origin to the ellipse

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \quad (a < b) \]

23 Determine the lengths of the sides of a rectangle
with maximum area that can be inscribed within
the ellipse

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \]

24 Find the optimum of $f = xy^2z$ subject to x + 2y + 3z
= 6 using a Lagrange multiplier method.

25 Show that the stationary points of $f = x^2 + y^2 + z^2$
subject to x + y - z = 0 and yz + 2zx - 2xy = 1 are
given by the solution of the equations

\[ 0 = 2x + \lambda + (2z - 2y)\mu \]
\[ 0 = 2y + \lambda + (z - 2x)\mu \]
\[ 0 = 2z - \lambda + (y + 2x)\mu \]

Add the last two of these equations, and show
that either μ = -2 or y + z = 0. Hence deduce the
stationary points.

26 A rectangular box without a lid is to be made.
It is required to maximize the volume for a given
surface area. Find the dimensions of the box when
the total surface area is A.

27 (Harder) The lowest frequency of vibration,
ω, of an elastic plate can be computed by
minimizing

\[ I[\phi] = \iint_R (\nabla^2\phi)^2 \, dx\, dy \]

subject to

\[ \iint_R \phi^2 \, dx\, dy = 1 \]

over all functions φ(x, y), where R is the region
of the plate in the (x, y) plane. If R is the square
region |x| ≤ 1, |y| ≤ 1 and the plate is clamped at
its edges, use the approximation

\[ \phi = A\cos^2\tfrac{1}{2}\pi x\,\cos^2\tfrac{1}{2}\pi y \]

to show that $I[\phi_{\min}] = \omega^2 = \tfrac{8}{9}\pi^4$.
Use the improved approximation

\[ \phi = \cos^2\tfrac{1}{2}\pi x\,\cos^2\tfrac{1}{2}\pi y\,(A + B\cos\tfrac{1}{2}\pi x\,\cos\tfrac{1}{2}\pi y) \]

to get a better estimate of ω.

(Note: $\int_{-1}^{1}\cos^{2n}\tfrac{1}{2}\pi z\, dz = (2n)!/[(n!)^2\, 2^{2n-1}]$ for
non-negative integers n. Preferably use an algebraic
symbolic manipulator, for example MAPLE, to
evaluate the differentials and integrals.)

28 Use the Kuhn-Tucker criteria to find the minimum of

\[ 2x_1^2 + x_2^2 + 2x_1 \]

subject to

\[ x_1 - x_2 \leq \alpha \]

where α is a parameter. Find the critical value
of α at which the nature of the solution changes.
Sketch the situation geometrically to illustrate
the change.


10.4 Hill climbing


10.4.1 Single-variable search


Most practical problems give calculations that cannot be performed explicitly, and need
a numerical technique. Typical cases are those in Examples 10.1 and 10.11, where the
final equations cannot be solved analytically and we need to resort to numerical methods.
This is not an uncommon situation, and hill climbing methods were devised, mainly in
the 1960s, to cope with just such problems.

Figure 10.9 Bracketing procedure.

In many engineering problems the functions that we are trying to optimize cannot be
written down explicitly. Take for example a vibrational problem where the frequencies
of vibration are calculated from an eigenvalue problem. These frequencies will depend
on the parameters of the physical system, and it may be necessary to make the largest
frequency as low as possible. To illustrate this idea, suppose that the eigenvalues come
from the equation

\[ \begin{vmatrix} a - \lambda & -1 & 0 \\ -1 & -\lambda & -1 \\ 0 & -1 & a^2 - \lambda \end{vmatrix} = 0 \]

where there is just one parameter a. In this case the mathematical problem is to find
$\min(\lambda_{\max})$. We note that there is no explicit formula for $\lambda_{\max}$ as a simple function of a,
and our function is the result of a solution of the determinantal equation. For some values
of a the function can be calculated easily, $\lambda_{\max}(-1) = \sqrt{3}$, $\lambda_{\max}(0) = \sqrt{2}$ and $\lambda_{\max}(1) = 2$,
but any other value requires a considerable amount of work. However, a computer
package such as MATLAB will perform this hard work very quickly with the instruction
lamda=max(eig([a -1 0;-1 0 -1;0 -1 a^2])). Calculating the derivative
of the function with respect to a is too difficult even to contemplate.
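For readers without an eigenvalue routine to hand, the largest root can also be obtained from the determinant itself. The sketch below is one possible approach, not the method used by MATLAB's eig: it expands the 3 × 3 determinant to give the characteristic cubic and applies Newton's method starting from a Gershgorin bound, from which the iteration approaches the largest root from above. The function name and tolerances are illustrative choices.

```python
import math


def lambda_max(a, tol=1e-12, max_iter=100):
    """Largest eigenvalue of [[a, -1, 0], [-1, 0, -1], [0, -1, a**2]]."""
    a2 = a * a
    # Gershgorin's theorem: no eigenvalue exceeds the largest row bound.
    lam = max(abs(a) + 1.0, 2.0, a2 + 1.0)
    for _ in range(max_iter):
        q = lam * lam - a2 * lam - 1.0
        p = (a - lam) * q - (a2 - lam)                  # det(M - lam*I)
        dp = -q + (a - lam) * (2.0 * lam - a2) + 1.0    # derivative of p
        step = p / dp
        lam -= step
        if abs(step) < tol:
            break
    return lam


for a in (-1.0, 0.0, 1.0):
    print(a, lambda_max(a))
```

The three printed values can be compared with the exact results √3, √2 and 2 quoted in the text.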

In the determinant example we have a function of a single variable, but in most problems 
there are many variables. One of the commonest methods of attack is to obtain the 
maximum or minimum as a sequence of single-variable searches. We choose a direction 
and search in this direction until we have found the optimum of the function in the chosen 
direction. We then select a new direction and repeat the process. For this to be a successful 
method, we need to be able to perform single-variable searches very efficiently. This 
section therefore deals with single-variable problems, and only then in Section 10.4.3 
are multivariable techniques discussed. In deciding on a strategy for solution, one crucial 
point is whether derivatives can or cannot be calculated. In the eigenvalue problem, 
calculation of the derivative is difficult, and would probably not be attempted. If the 
derivative can be obtained, however, more information is available, and any numerical 
method can be speeded up considerably. With the increase in sophistication of computers,
this is becoming a less important consideration, since a good numerical approximation to
the derivative is usually quite satisfactory. This is certainly the case in the MATLAB
routine fminunc.

The basic problem is to determine the maximum of a function y = f(x) that is difficult
to evaluate and for which the derivative may or may not be available. The task is
performed in two stages: in Stage 1 we bracket the maximum by obtaining $x_1$ and $x_2$ such
that $x_1 \leq x_{\max} \leq x_2$ as described in Figures 10.9 and 10.10, and in Stage 2 we devise
a method that iterates to the maximum to any desired accuracy, as in Figures 10.11
and 10.12.





Figure 10.10 Algorithm for obtaining a bracket for the maximum of f(x).

(a) {Derivative not known}
If the function is defined by the anonymous function
f=@(x) .....
then the following code
aold=-1;h=.01;nmax=10; % aold, h, nmax are specified at values appropriate to the problem
zold=f(aold);a=aold+h;h=2*h;z=f(a);n=0; % step 1
while (z>zold)&&(n<nmax)
n=n+1;aoldold=aold;aold=a;a=a+h;h=2*h;zoldold=zold;zold=z;z=f(a); % subsequent steps
end
provides the bracket [aoldold,aold,a].

(b) {Derivative known}
If the function and its derivative are defined by the anonymous functions
f=@(x) .....
fdash=@(x) .....
then the following code
a=.01;h=.01;nmax=10;
zold=f(a);zdashold=fdash(a);aold=a;a=a+h;z=f(a);zdash=fdash(a);h=2*h;n=0;
while (zdash>0)&&(n<nmax)
n=n+1;zold=z;zdashold=zdash;aold=a;a=a+h;h=2*h;z=f(a);zdash=fdash(a);
end
provides the bracket [aold,a].











A bracket is most quickly achieved by starting at a given point, taking a step in the
increasing direction and proceeding in this direction, doubling the step length at each step
until the bracket is obtained (Figure 10.9). The basic idea is summarized in Figure 10.10,
which gives a MATLAB procedure for this technique. It is written as a 'stand-alone'
segment and can be adapted for other packages, but it would normally be incorporated in
a more general program. It is assumed that sensible values for a and h have been chosen,
but in a working program great care has to be taken. A great deal of effort is required to
cope with inappropriate choices and to prevent the program aborting.

The algorithm works very efficiently provided that appropriate safeguards are included,
but it is not foolproof. The maximum number of steps chosen, nmax, is usually 10, and
the initial value of h is small compared with the overall dimension of the problem under
consideration.


Example 10.13 Find a bracket for the first maximum of f(x) = x sin x using the algorithm in Figure 10.10.


Solution Choose a = 0.01 and h = 0.01; the algorithm then gives

a     0.01    0.02    0.04    0.08    0.16    0.32    0.64    1.28    2.56    5.12
f     0.000   0.000   0.002   0.006   0.025   0.101   0.382   1.226   1.406  -4.701
f'    0.020   0.040   0.080   0.160   0.317   0.618   1.111   1.325  -1.590     -

If the derivative is not used then the bracket is 1.28 < x < 5.12.
If the derivative is used then the bracket is 1.28 < x < 2.56.
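The table can be reproduced with a few lines of code. The following is a minimal sketch of the step-doubling idea for the function of Example 10.13; it omits the safeguards that a working program would need, and the function names are illustrative rather than taken from Figure 10.10.

```python
import math


def f(x):
    return x * math.sin(x)


def fdash(x):
    return math.sin(x) + x * math.cos(x)


def bracket_max(func, a, h, nmax=10):
    """Step-doubling bracket using function values only: returns three points."""
    a_prev2 = None
    a_prev, z_prev = a, func(a)
    a = a + h
    z = func(a)
    n = 0
    while z > z_prev and n < nmax:
        n += 1
        h *= 2.0                       # double the step length
        a_prev2, a_prev, z_prev = a_prev, a, z
        a = a + h
        z = func(a)
    return a_prev2, a_prev, a


def bracket_max_deriv(deriv, a, h, nmax=10):
    """Step-doubling bracket using the sign of the derivative: returns two points."""
    a_prev = a
    n = 0
    while deriv(a) > 0 and n < nmax:
        n += 1
        a_prev = a
        a = a + h
        h *= 2.0
    return a_prev, a


bracket_no_deriv = bracket_max(f, 0.01, 0.01)
bracket_deriv = bracket_max_deriv(fdash, 0.01, 0.01)
print(bracket_no_deriv)   # approximately (1.28, 2.56, 5.12)
print(bracket_deriv)      # approximately (1.28, 2.56)
```

Using the derivative halts the search one step earlier, which is why the second bracket is the tighter of the two.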
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Figure 10.11 
(a) MATLAB file qapp.m for the quadratic approximation algorithm; the function
segment for f(x) is declared in the file fnn.m; (b) diagrams corresponding to the
four cases considered in the program.

(a)
function [x,f]=qapp(a,b)
% a=[a1,a2,a3] is the input vector of three points from bracketing
% b=[f(a1),f(a2),f(a3)] is the vector of function values
x=a;f=b;p=polyfit(a,b,2);
xstar=-0.5*p(2)/p(1);fstar=fnn(xstar);
if fstar>b(2)
    if xstar<a(2), x(3)=a(2);f(3)=b(2);
    else x(1)=a(2);f(1)=b(2); end
    x(2)=xstar;f(2)=fstar;
else
    if xstar<a(2), x(1)=xstar;f(1)=fstar;
    else x(3)=xstar;f(3)=fstar; end
end
% x contains the three points of the new bracket and f the function values

(b) [Diagrams of the four cases: the old labels x1, x2, x3 and x* in each possible
ordering, with the new labels x1, x2, x3 assigned to the three retained points.]


Figure 10.12 (a) MATLAB file cufit.m for the cubic approximation algorithm for the
maximum of f(x); the function segments f(x) and fdash(x) are declared in the file cub.m;
(b) diagrams corresponding to the two cases considered in the program.

(a)
function [an,bn]=cufit(a,b)
% a=[x1 f(x1) fdash(x1)] and b=[x2 f(x2) fdash(x2)] are the input vectors
v=[a(2);b(2);a(3);b(3)];
A=[a(1)^3 a(1)^2 a(1) 1;b(1)^3 b(1)^2 b(1) 1;3*a(1)^2 2*a(1) 1 0;3*b(1)^2 2*b(1) 1 0];
p=A\v;xstar=(-p(2)-sqrt(p(2)^2-3*p(1)*p(3)))/(3*p(1));c=cub(xstar);
if c(3)>0 an=c;bn=b; else bn=c;an=a;end
% an and bn contain the new bracket vectors





Stage 2 of the calculation is to use the bracket just obtained and then iterate to an
accurate maximum. A simple and efficient approach is to use a polynomial approximation
to estimate the maximum and then choose the 'best' points to repeat the
calculation.

If it is assumed that no derivative is available then a bracket is known from the
algorithm of Figure 10.10, so that $x_1$, $x_2$, $x_3$ and the corresponding $f_1$, $f_2$, $f_3$, with $f_1 < f_2$
and $f_3 < f_2$, are given. The quadratic polynomial through these points can be written
down immediately: it is just the Lagrange interpolation formula that was discussed
in Section 2.3 of Modern Engineering Mathematics. It can easily be checked that the
quadratic which passes through the required points is given by

\[ F(x) = \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)} f_1 + \frac{(x - x_3)(x - x_1)}{(x_2 - x_3)(x_2 - x_1)} f_2 + \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)} f_3 \qquad (10.9) \]

Then F' = 0 at the point x*, which is given, after a little algebra, by

\[ x^* = \frac{(x_2^2 - x_3^2) f_1 + (x_3^2 - x_1^2) f_2 + (x_1^2 - x_2^2) f_3}{2[(x_2 - x_3) f_1 + (x_3 - x_1) f_2 + (x_1 - x_2) f_3]} \qquad (10.10) \]

A MATLAB procedure for the algorithm that uses this new x* and f* is given in
Figure 10.11, where the best three values are chosen for the next iteration. The code can
be easily adapted for other packages. It can be re-written in terms of anonymous functions
as in Figure 10.10 but there is great merit in breaking the program into small units
that can be checked independently; these are M-files in MATLAB as in Figure 10.11.

The method works exceptionally well by repeating the instruction [a,b]=qapp(a,b),
but again it is not totally foolproof, and remedial checks need to be put into a working
program. The stopping criterion is very problem-dependent, and requires thought and
numerical experimentation. The MATLAB procedure fminbnd uses the quadratic
approximation method (for the minimum problem). It only requires a bound for the
minimum and uses another method, the Golden section (see Exercise 34), to obtain
three starting values. It then proceeds similarly to the current algorithm. The two lines
of code below solve the eigenvalue problem posed at the start of this section


options=optimset('display','iter');
[x,fval]=fminbnd('max(eig([x -1 0;-1 0 -1;0 -1 x^2]))',-1,1,options)


It is worth looking at the full code of the MATLAB procedure for fminbnd to appreciate
the enormous effort required to automate the problem fully, to deal with errors and
failures and to make the program 'user friendly'.
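The same minimization can be sketched without MATLAB. The code below is not a reimplementation of fminbnd: it uses a plain golden-section search on the interval -1 ≤ x ≤ 1, assuming (as the bracket in the text suggests) that the largest eigenvalue has a single minimum there. The eigenvalue itself is found by Newton's method on the characteristic cubic, started from a Gershgorin bound; both helper functions are illustrative.

```python
import math


def lambda_max(x, tol=1e-12, max_iter=100):
    """Largest eigenvalue of [[x, -1, 0], [-1, 0, -1], [0, -1, x**2]]."""
    x2 = x * x
    lam = max(abs(x) + 1.0, 2.0, x2 + 1.0)   # Gershgorin upper bound
    for _ in range(max_iter):
        q = lam * lam - x2 * lam - 1.0
        p = (x - lam) * q - (x2 - lam)        # characteristic determinant
        dp = -q + (x - lam) * (2.0 * lam - x2) + 1.0
        step = p / dp
        lam -= step
        if abs(step) < tol:
            break
    return lam


def golden_section_min(func, lo, hi, tol=1e-6):
    """Shrink [lo, hi] around the minimum of a unimodal function."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0      # about 0.618
    x1 = hi - ratio * (hi - lo)
    x2 = lo + ratio * (hi - lo)
    f1, f2 = func(x1), func(x2)
    while hi - lo > tol:
        if f1 < f2:                           # minimum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - ratio * (hi - lo)
            f1 = func(x1)
        else:                                 # minimum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + ratio * (hi - lo)
            f2 = func(x2)
    return 0.5 * (lo + hi)


x_best = golden_section_min(lambda_max, -1.0, 1.0)
print(x_best, lambda_max(x_best))
```

Each golden-section step reuses one of the two interior function values, so only one new eigenvalue calculation is needed per iteration.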


Example 10.14 Find the first maximum of f(x) = x sin x given the values from Example 10.13, namely
$x_1 = 1.28$, $x_2 = 2.56$, $x_3 = 5.12$, $f_1 = 1.226$, $f_2 = 1.406$ and $f_3 = -4.701$.


Solution From (10.10), x* = 2.03 and f* = 1.820, so for the next iteration choose

$x_1 = 1.28$, $x_2 = 2.03$, $x_3 = 2.56$
$f_1 = 1.226$, $f_2 = 1.820$, $f_3 = 1.406$

From (10.10), x* = 1.98 and f* = 1.816, so for the next iteration choose

$x_1 = 1.98$, $x_2 = 2.03$, $x_3 = 2.56$
$f_1 = 1.816$, $f_2 = 1.820$, $f_3 = 1.406$

From (10.10), x* = 2.027 and f* = 1.820, so the method has almost converged.
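The iterations of Example 10.14 are easy to reproduce. The sketch below evaluates the stationary point of the interpolating quadratic from (10.10) and then applies the same four-case selection rule as the qapp procedure; the function names are illustrative, and no stopping test or safeguard against a zero denominator is included.

```python
import math


def f(x):
    return x * math.sin(x)


def quadratic_peak(x, fv):
    """Stationary point of the quadratic through three points, as in (10.10)."""
    x1, x2, x3 = x
    f1, f2, f3 = fv
    num = (x2**2 - x3**2) * f1 + (x3**2 - x1**2) * f2 + (x1**2 - x2**2) * f3
    den = 2.0 * ((x2 - x3) * f1 + (x3 - x1) * f2 + (x1 - x2) * f3)
    return num / den


def qapp_step(func, x, fv):
    """One cycle: compute x*, then keep the three points that still bracket the maximum."""
    xs = quadratic_peak(x, fv)
    fs = func(xs)
    x, fv = list(x), list(fv)
    if fs > fv[1]:
        if xs < x[1]:
            x[2], fv[2] = x[1], fv[1]
        else:
            x[0], fv[0] = x[1], fv[1]
        x[1], fv[1] = xs, fs
    elif xs < x[1]:
        x[0], fv[0] = xs, fs
    else:
        x[2], fv[2] = xs, fs
    return x, fv


x = [1.28, 2.56, 5.12]
fv = [f(v) for v in x]
history = []
for _ in range(2):
    x, fv = qapp_step(f, x, fv)
    history.append(list(x))
    print([round(v, 3) for v in x], [round(v, 3) for v in fv])
```

The small differences from the hand calculation arise because the example works with values rounded to three decimal places.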


When the derivative is available, a better approximating polynomial than (10.9)
can be used, since $x_1$, $f_1$, $f_1'$ (> 0), $x_2$, $f_2$ and $f_2'$ (< 0) are known from the bracketing
algorithm, and this data can be fitted to a cubic polynomial

\[ f \approx F = ax^3 + bx^2 + cx + d \]
\[ F' = 3ax^2 + 2bx + c \]

In this case fitting the values just gives the matrix equation

\[ \begin{bmatrix} f_1 \\ f_2 \\ f_1' \\ f_2' \end{bmatrix} = \begin{bmatrix} x_1^3 & x_1^2 & x_1 & 1 \\ x_2^3 & x_2^2 & x_2 & 1 \\ 3x_1^2 & 2x_1 & 1 & 0 \\ 3x_2^2 & 2x_2 & 1 & 0 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} \qquad (10.11) \]

and the maximum of F is given by F' = 0, so that

\[ x^* = \frac{-b \pm (b^2 - 3ac)^{1/2}}{3a} \qquad (10.12) \]

and the negative sign is chosen (the positive sign for a minimum problem) to ensure
that $x_1 < x^* < x_2$.

A simple algorithm uses these results to choose the appropriate bracket for the
next iteration. The algorithm is illustrated in Figure 10.12, and is an efficient iterative
way of evaluating the maximum; just repeat the instruction [a,b]=cufit(a,b).
Unfortunately it is very easy to make errors in a hand computation, and many people
prefer the quadratic algorithm for this purpose. On a computer, however, the cubic
approximation method is almost universally used.


Example 10.15 Find the first maximum of f(x) = x sin x given the bracket values from Example 10.13,
namely $x_1 = 1.28$, $f_1 = 1.226$, $f_1' = 1.325$, and $x_2 = 2.56$, $f_2 = 1.406$, $f_2' = -1.590$.


Solution Solving (10.11) gives a = -0.3333, b = 0.7814 and c = 0.9630, and hence, from (10.12),

x* = 2.036, f* = 1.820, f*' = -0.0192

Thus $x_2$, $f_2$ and $f_2'$ are replaced by x*, f* and f*', and $x_1$, $f_1$ and $f_1'$ are retained.
Equation (10.11) is now recomputed and solved to give

a = -0.4626, b = 1.412, c = -0.0153

Hence, from (10.12),

x* = 2.029, f* = 1.820, f*' = -0.0007

Further iterations may be performed to get an even more accurate value. Comparison
with Example 10.14 shows that both the quadratic and cubic algorithms work well for
this function.
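The first cubic fit of Example 10.15 can be checked with a short program. The sketch below sets up the 4 × 4 system (10.11), solves it with a small Gaussian elimination routine (written out here only so that the example needs no external library), and evaluates (10.12) with the negative sign. The helper names are illustrative.

```python
import math


def solve_linear(A, v):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(v)
    M = [row[:] + [rhs] for row, rhs in zip(A, v)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def cubic_peak(x1, f1, d1, x2, f2, d2):
    """Fit F = a x^3 + b x^2 + c x + d to values and slopes at x1, x2; return coefficients and x*."""
    A = [[x1**3, x1**2, x1, 1.0],
         [x2**3, x2**2, x2, 1.0],
         [3 * x1**2, 2 * x1, 1.0, 0.0],
         [3 * x2**2, 2 * x2, 1.0, 0.0]]
    a, b, c, d = solve_linear(A, [f1, f2, d1, d2])
    xstar = (-b - math.sqrt(b * b - 3.0 * a * c)) / (3.0 * a)   # negative sign for a maximum
    return (a, b, c, d), xstar


coeffs, xstar = cubic_peak(1.28, 1.226, 1.325, 2.56, 1.406, -1.590)
print([round(v, 4) for v in coeffs[:3]], round(xstar, 3))
```

Because the input data are rounded to three decimal places, the computed coefficients may differ from the quoted ones in the last digit or two.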


There are 'built in' maximizing routines included in most computer packages
which can deal with most functions that arise in engineering computations. In MATLAB
the single instruction fminbnd('-x*sin(x)',1.28,5.12) produces the value
x = 2.0288 instantly.


10.4.2 Exercises


29 (a) Find a bracket for the minimum of the function
f(x) = x + 1/x. Start at x = 0.1 and h = 0.2.

(b) Use two cycles of the quadratic approximation
to obtain an estimate of the minimum.

(c) Use one cycle of the cubic approximation to
obtain an estimate of the minimum.


30 The function

\[ f = \frac{\sin x}{1 + x^2} \]

has been computed as follows:

x      -0.5       0        1         3
f      -0.3825    0        0.4207    0.0141
f'      0.3952    1       -0.1506   -0.1075

Compare the brackets, obtained from the
bracketing procedure, to be used in calculating
the maximum of the function: (a) without using
the derivatives, and (b) using the derivatives.
Use these brackets to perform one iteration of
each of the quadratic and the cubic algorithms.
Compare the values obtained from the two
calculations.


31 Starting with the bracket 1 < x < 3, determine an
approximation to the maximum of the function

\[ f(x) = x(e^{-x} - e^{-2x}) \]

(a) using two iterations of the quadratic
algorithm, and

(b) using two iterations of the cubic algorithm.

(c) How many iterations are required to obtain
three-figure accuracy?


32 Use the quadratic algorithm to obtain an estimate of
the value of x that gives the minimum value to the
largest root of the eigenvalue equation (that is,
$\min[\lambda_{\max}]$)

\[ \begin{vmatrix} x - \lambda & -1 & 0 \\ -1 & -\lambda & -1 \\ 0 & -1 & x^2 - \lambda \end{vmatrix} = 0 \]

Use the bracket given by x = 1 and x = -1.


33 Show that if $x_1 = x_2 - h$ and $x_3 = x_2 + h$ then (10.10)
reduces to

\[ x^* = x_2 - \tfrac{1}{2}h\,\frac{f_3 - f_1}{f_1 - 2f_2 + f_3} \]

(Note: This provides a better hand computation
method than the Lagrange interpolation approach.
The best x value is chosen and h is replaced by
½h after each step.) Show that this formula is a
numerical form of the Newton-Raphson method
applied to the equation f'(x) = 0.


34 An interval AB is divided symmetrically at points C
and D. If AC/AD = AD/AB show that C divides AB
in the Golden Ratio $\alpha = \tfrac{1}{2}(3 - \sqrt{5}) = 0.382\ldots$
The function f(x) is known to have a maximum in
the interval $a_1 \leq x \leq a_4$. It is evaluated at the points $a_1$,
$a_4$ and at the golden section points $a_2 = (1 - \alpha)a_1 + \alpha a_4$
and $a_3 = \alpha a_1 + (1 - \alpha)a_4$. If $f(a_2) > f(a_3)$ then the new
bracket is taken as $a_1 \leq x \leq a_3$; if $f(a_2) < f(a_3)$ then
the new bracket is taken as $a_2 \leq x \leq a_4$. The method
is then repeated. Test the method on the functions


(a) f(x) = x sin x with bracket 0, 2.5;





(b) Дх) = 1 (m x +20) with bracket 
(1-x) x 
1.5, 2.5. 
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10.4.3 


Simple multivariable searches 


As indicated in Section 10.4.1, many multivariable search methods use a sequence of 
single-variable searches to achieve a maximum. The fundamental question is how to 
choose a sensible direction in which to search for the top of the hill with a very limited 
amount of local information. The problem can be visualized when no derivatives are 
available as sitting in a dense fog and trying to get to the top of the hill with only an 
altimeter available. If derivatives are available then the fog has lifted a little, and we 
can now see a few feet around, so that at least we can see which is the uphill direction. 
The criteria for choosing a direction are (a) an easy choice of direction and (b) one that 
gives an efficient climbing method. This is not a simple task, and although the methods 
described in this section are rarely used nowadays, they do provide the basis for more 
advanced methods. Modern methods are difficult to program, not because the basic 
method is difficult but because of the vast amount of remedial action that must be taken 
to prevent the program failing when something goes wrong. The general advice here is 
to understand the basic idea behind a method and then to implement it using a program 
from a reliable software library such as NAG (distributed by Numerical Algorithms 
Group Ltd of Oxford, UK) or a package such as MATLAB. 

Perhaps the most obvious method of choosing a search direction is to use the locally 
steepest direction. For the function $f(x_1, x_2, \ldots, x_n)$ this is known to be in the gradient
direction $G = \operatorname{grad} f = [\partial f/\partial x_1, \partial f/\partial x_2, \ldots, \partial f/\partial x_n]$. If the derivatives are not available
then in current practice they are evaluated numerically.

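The gradient direction can be easily proved to give the maximum change. From a
given point $(a_1, \ldots, a_n)$, we proceed in the direction $(h_1, h_2, \ldots, h_n)$ with given step
length $h$ so that $h_1^2 + h_2^2 + \ldots + h_n^2 = h^2$. We then require

\[ \max F = f(a_1 + h_1, \ldots, a_n + h_n) - f(a_1, \ldots, a_n) \]

subject to the constraint

\[ h_1^2 + \ldots + h_n^2 = h^2 \]

The problem is one of Lagrange multipliers, so

\[ \frac{\partial F}{\partial h_i} + \lambda \frac{\partial}{\partial h_i}\left(h_1^2 + \ldots + h_n^2 - h^2\right) = 0 \]

Thus

\[ \frac{\partial F}{\partial h_i} = \frac{\partial f}{\partial x_i} = -2\lambda h_i \quad (i = 1, \ldots, n) \]

and hence

\[ [h_1, h_2, \ldots, h_n] \ \text{is proportional to} \ \left[\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}\right] \]

The method of steepest ascent (or steepest descent, for minima) then proceeds from the
point a by choosing the gradient direction G (or -G for minima) for a search direction.
We therefore need to find

\[ \max_t \{ g(t) = f(a_1 + tG_1, a_2 + tG_2, \ldots, a_n + tG_n) \} \qquad (10.13) \]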


Since (10.13) is a function of a single variable t, the methods of Section 10.4.1 are
appropriate, and we should expect the cubic algorithm to be used in the optimization.
Once the best available point in the search direction has been found, $a + t_{\max}G$, the new
gradient direction is computed and the whole process is repeated. The algorithm is
fairly straightforward, and is summarized in Figure 10.13.

read {$a_1, a_2, \ldots, a_n$}
repeat
    {evaluate $f(a)$ and $G_1 = \partial f/\partial x_1$, $G_2 = \partial f/\partial x_2$, ...}
    {maximize $g(t)$ in (10.13) by the cubic algorithm
     to give a new point $a \leftarrow a + t_{\max}G$}
until {$G = 0$}

Figure 10.13 Steepest-ascent algorithm.

The steepest-ascent (or descent) method has the great advantage of being simple and
secure, but it has the disadvantage of being very slow, particularly near to the optimum.
It is rarely used nowadays, but does form the basis of the hill climbing methods described
in Section 10.4.5.

Example 10.16

Find the maximum of the function

\[ f(x_1, x_2) = -(x_1 - 1)^4 - (x_1 - x_2)^2 \]

by the steepest-ascent method, starting at the point (0, 0).

Solution

It is clear that (1, 1) gives the maximum, but this example is used to illustrate the basic
method. The gradient is easily calculated from the partial derivatives

\[ \frac{\partial f}{\partial x_1} = -4(x_1 - 1)^3 - 2(x_1 - x_2), \qquad \frac{\partial f}{\partial x_2} = 2(x_1 - x_2) \]

Cycle 1: At the point (0, 0), $f = -1$, $G = [4, 0]$, the search direction is $x_1 = 4t$, $x_2 = 0$,
and we require

\[ \max_t \{ g(t) = -(4t - 1)^4 - 16t^2 \} \]

This can be calculated as $t_{\max} = 0.102\,56$, so that we can start Cycle 2.

Cycle 2: The new point is (0.410 25, 0), $f = -0.289$ and $G = [0, 0.8205]$. The next
search is in the direction $x_1 = 0.410\,25$, $x_2 = 0.8205t$, and we require

\[ \max_t \{ g(t) = -0.1210 - (0.410\,25 - 0.8205t)^2 \} \]

This has the obvious solution $t_{\max} = \tfrac{1}{2}$, so that we can move to Cycle 3.

Cycle 3: The new point is (0.410 25, 0.410 25), $f = -0.1210$, and $G = [0.8205, 0]$.
The calculation can be continued until $G = 0$ to the required accuracy.
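
The cycles of Example 10.16 can be reproduced with a short Python sketch of the loop in Figure 10.13. Two things here are assumptions made for illustration: the line maximization uses a golden-section search rather than the cubic algorithm, and the search bracket $0 \le t \le 1$ is simply taken to be wide enough for this particular function. The objective is the function as reconstructed from the cycle values quoted in the example.

```python
import math

ALPHA = (3 - math.sqrt(5)) / 2


def f(x1, x2):
    return -(x1 - 1) ** 4 - (x1 - x2) ** 2


def grad(x1, x2):
    return (-4 * (x1 - 1) ** 3 - 2 * (x1 - x2), 2 * (x1 - x2))


def line_max(g, lo=0.0, hi=1.0, tol=1e-10):
    """Golden-section search for the maximum of g(t) on [lo, hi]."""
    while hi - lo > tol:
        t2 = lo + ALPHA * (hi - lo)
        t3 = hi - ALPHA * (hi - lo)
        if g(t2) > g(t3):
            hi = t3
        else:
            lo = t2
    return 0.5 * (lo + hi)


def steepest_ascent(a, cycles):
    """Return the list of points visited, starting with a."""
    history = [a]
    for _ in range(cycles):
        G = grad(*a)
        if math.hypot(*G) < 1e-8:
            break
        t_max = line_max(lambda t: f(a[0] + t * G[0], a[1] + t * G[1]))
        a = (a[0] + t_max * G[0], a[1] + t_max * G[1])
        history.append(a)
    return history


for point in steepest_ascent((0.0, 0.0), 4):
    print(point, f(*point))
```

Running more cycles shows the slow, axis-parallel staircase towards (1, 1) that the text goes on to discuss.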


There are one or two points to note from this calculation. The function value is steadily 
increasing, which is a good feature of the method, but after the first few iterations the 
method progresses in a large number of very small steps. The successive search directions 
are parallel to the axes, and hence are perpendicular to each other. This perpendicularity 
is just a restatement of the known result that the gradient vector is perpendicular to 
the contours (see Section 3.2.1). In Example 10.16 the function $g(t)$ is written down
explicitly for clarity. In practice on a computer this would not be done, since once the
search direction has been established, $x_1$ and $x_2$ are known functions of $t$ only, and by
the chain rule we have

\[ \frac{dg}{dt} = \frac{\partial f}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial f}{\partial x_2}\frac{dx_2}{dt} = \frac{\partial f}{\partial x_1}G_1 + \frac{\partial f}{\partial x_2}G_2 \]

Since $x_1$ and $x_2$ are known, both $\partial f/\partial x_1$ and $\partial f/\partial x_2$ can be calculated, and $G = [G_1, G_2]$
is the known search direction; therefore $dg/dt$ is computed without explicitly writing
down the function $g$.
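
As a small illustration of this point, the slope $dg/dt$ along a search direction can be coded directly from the gradient routine, without ever forming $g$. The sketch below assumes the function of Example 10.16 and the Cycle 1 direction; the helper name `dg_dt` is hypothetical.

```python
def grad(x1, x2):
    """Gradient of f(x1, x2) = -(x1 - 1)**4 - (x1 - x2)**2."""
    return (-4 * (x1 - 1) ** 3 - 2 * (x1 - x2), 2 * (x1 - x2))


def dg_dt(a, G, t):
    """Slope of g(t) = f(a + t*G), from the chain rule: grad f . G."""
    x1, x2 = a[0] + t * G[0], a[1] + t * G[1]
    d1, d2 = grad(x1, x2)
    return d1 * G[0] + d2 * G[1]


a = (0.0, 0.0)
G = grad(*a)                   # Cycle 1 search direction
print(dg_dt(a, G, 0.0))        # slope at the start of the line search
print(dg_dt(a, G, 0.10256))    # slope near the line maximum
```

A derivative-based line search such as the cubic algorithm only needs calls of this kind.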

The major criticism of the steepest-ascent method is that it is slow to converge, and 
so the question arises as to how it can be speeded up. One method in use in many 
programs is to use a fixed number of iterations in the line search, provided the function 
is increased. Some experimentation is required on how to implement this idea, but it 
can lead to significant improvement in speed. 

It is well known that Newton-Raphson methods (see Section 9.4.8 of Modern 
Engineering Mathematics) converge very rapidly, so the same basic idea is tried for 
these problems. It is convenient to employ matrix notation, and indeed most multidi- 
mensional optimization methods are written in matrix form. This gives a compact nota- 
tion, and arrays appear naturally in computer languages. 

Taylor's theorem (see Section 9.4.1 of Modern Engineering Mathematics) can be 
written in matrix form to second order as 


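\[ f(a_1 + h_1, \ldots, a_n + h_n) \approx f(a_1, \ldots, a_n) + h^T G + \tfrac{1}{2} h^T J h \qquad (10.14) \]

where

\[ h = \begin{bmatrix} h_1 \\ \vdots \\ h_n \end{bmatrix}, \quad G = \begin{bmatrix} \partial f/\partial x_1 \\ \vdots \\ \partial f/\partial x_n \end{bmatrix}, \quad J = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix} \]

The form (10.14) can be verified by multiplying out the matrices and comparing
with the standard form of Taylor's theorem. The vector G is just the gradient vector,
which is now written in matrix form as a column vector, and J is an n x n symmetric
matrix of second derivatives called the Hessian matrix (the Jacobian of the gradient vector).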

The Newton method takes (10.14) as an approximation to f, finds the maximum 
(or minimum) of the quadratic approximation, and uses this best value to start the 
cycle again. 

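The optimum of (10.14) is given by $\partial f/\partial h_i = 0$ ($i = 1, 2, \ldots, n$). The first of these
conditions gives

\[ 0 = [1\ 0\ 0\ \ldots\ 0]\,G + \tfrac{1}{2}[1\ 0\ \ldots\ 0]\,Jh + \tfrac{1}{2}h^T J \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \qquad (10.15) \]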

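Noting that, since J is symmetric, for any vectors r and s we have

\[ r^T J s = (r^T J s)^T = s^T J r \]

and hence (10.15) can be written as

\[ 0 = [1\ 0\ 0\ \ldots\ 0](G + Jh) \]

Similarly for the other components we obtain

\[ 0 = [0\ 1\ 0\ \ldots\ 0](G + Jh) \]
\[ \vdots \]
\[ 0 = [0\ 0\ 0\ \ldots\ 1](G + Jh) \]

The only way to satisfy this set of equations is to have

\[ G + Jh = 0 \]

and hence, provided the inverse exists,

\[ h = -J^{-1}G \]

The basic algorithm is now very straightforward, as indicated in Figure 10.14.

repeat
    {$a_i$ known, calculate $G_i$, $J_i$}
    {evaluate $a_{i+1} = a_i - J_i^{-1}G_i$}
until {sufficient accuracy}

Figure 10.14 Newton algorithm.

When Newton's algorithm of Figure 10.14 converges, it does so very rapidly and
satisfies our request for a fast method. For a quadratic function it only takes one iteration
so, provided the function looks like a quadratic, the method will be expected to
converge rapidly.

Example 10.17

Use Newton's method to find the minimum of

\[ A = (h + b + 10)\left(\frac{2\,272\,000}{hb} + 2b + 5\right) \]

Solution

Here we have returned to the 'milk carton' problem of Example 10.2. We can calculate

\[ \frac{\partial A}{\partial h} = \left(\frac{2\,272\,000}{hb} + 2b + 5\right) - (h + b + 10)\frac{2\,272\,000}{h^2 b} \]

\[ \frac{\partial A}{\partial b} = \left(\frac{2\,272\,000}{hb} + 2b + 5\right) + (h + b + 10)\left(-\frac{2\,272\,000}{hb^2} + 2\right) \]

\[ \frac{\partial^2 A}{\partial h^2} = 2(b + 10)\frac{2\,272\,000}{h^3 b}, \quad \frac{\partial^2 A}{\partial b\,\partial h} = 2 + \frac{10 \times 2\,272\,000}{h^2 b^2}, \quad \frac{\partial^2 A}{\partial b^2} = 4 + 2(h + 10)\frac{2\,272\,000}{hb^3} \]

Iteration 1

\[ a = \begin{bmatrix} 100 \\ 100 \end{bmatrix}, \quad G = \begin{bmatrix} -44.92 \\ 375.1 \end{bmatrix}, \quad J = \begin{bmatrix} 4.998 & 2.227 \\ 2.227 & 8.998 \end{bmatrix} \]

\[ a_{\text{new}} = \begin{bmatrix} 100 \\ 100 \end{bmatrix} - \begin{bmatrix} 0.2249 & -0.0557 \\ -0.0557 & 0.1249 \end{bmatrix} \begin{bmatrix} -44.92 \\ 375.1 \end{bmatrix} = \begin{bmatrix} 131 \\ 50.6 \end{bmatrix} \]

Iteration 2

\[ a = \begin{bmatrix} 131 \\ 50.6 \end{bmatrix}, \quad G = \begin{bmatrix} -52.3 \\ -465.7 \end{bmatrix}, \quad J = \begin{bmatrix} 2.421 & 2.517 \\ 2.517 & 41.75 \end{bmatrix} \]

\[ a_{\text{new}} = \begin{bmatrix} 131 \\ 50.6 \end{bmatrix} - \begin{bmatrix} 0.4407 & -0.0266 \\ -0.0266 & 0.0255 \end{bmatrix} \begin{bmatrix} -52.3 \\ -465.7 \end{bmatrix} = \begin{bmatrix} 141.7 \\ 61.1 \end{bmatrix} \]

Iteration 3

\[ a = \begin{bmatrix} 141.7 \\ 61.1 \end{bmatrix}, \quad G = \begin{bmatrix} -4.493 \\ -98.76 \end{bmatrix}, \quad J = \begin{bmatrix} 1.858 & 2.303 \\ 2.303 & 25.33 \end{bmatrix} \]

\[ a_{\text{new}} = \begin{bmatrix} 141.7 \\ 61.1 \end{bmatrix} - \begin{bmatrix} 0.6067 & -0.0552 \\ -0.0552 & 0.0445 \end{bmatrix} \begin{bmatrix} -4.493 \\ -98.76 \end{bmatrix} = \begin{bmatrix} 139 \\ 65.2 \end{bmatrix} \]

The iterations converge very rapidly; $h = 139$, $b = 65$ is not far from the solution, and
gives $A = 827$ cm$^2$.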


Since the Newton method involves matrices MATLAB proves to be a very suitable 
package to perform the manipulations. The simple instructions

a=[100;100]
[f,agrad,ajac]=newton(a(1),a(2)); a=a-ajac\agrad'

with the last line repeated, give the successive iterations of the example. The
following listing gives the m-file newton.m that is used for Example 10.17.


function [a,agrad,ajac]=newton(h,b)
% value, gradient and second-derivative matrix of the carton area
a=(10+h+b)*(5+2*b+2272000/(h*b));
agrad(1)=5+2*b+2272000/(h*b)-(10+h+b)*2272000/(h^2*b);
agrad(2)=5+2*b+2272000/(h*b)+(10+h+b)*(-2272000/(h*b^2)+2);
ajac(1,1)=2*(b+10)*2272000/(h^3*b);
ajac(1,2)=2+10*2272000/(h^2*b^2); ajac(2,1)=ajac(1,2);
ajac(2,2)=4+2*(h+10)*2272000/(h*b^3);

Unfortunately Newton's method is very unreliable, particularly for higher-dimensional
problems, unless the starting value is close to the maximum. The reason for this is fairly
clear. The analysis given only uses the necessary condition for a maximum, but it would
apply equally well to a minimum or saddle point. In multidimensional problems saddle
points are abundant, and the usual failure of the Newton method is that it proceeds
towards a distant saddle point and then diverges.

Applying the method to the Rosenbrock function, called the 'banana' function in
MATLAB,

\[ f(x, y) = 100(x - y^2)^2 + (1 - x)^2 \]

the unreliability, but rapid convergence, is illustrated in Figure 10.15.

Starting point       Iterations to convergence   Final point    Comments
(-1.9, 1)            1                           (1, 1)         minimum
(-1.9, 0.9)          6                           (1, 1)         minimum
(-1.9, 0.5)          6                           (1, -1)        minimum
(-1.9, 0) run 1      -                           -              aborts, J is singular
(-1.9, 0) run 2      1                           (1/101, 0)     saddle point
(-1.9, -0.5)         7                           (1, 1)         minimum
(-1.9, -1)           1                           (1, -1)        minimum

Figure 10.15 Behaviour of Newton's method for the Rosenbrock function for various starting points.
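
The behaviour summarized in Figure 10.15 can be explored with the following Python sketch. It assumes the function in the form $f(x, y) = 100(x - y^2)^2 + (1 - x)^2$, which is the form consistent with the final points listed in the figure, and it assumes a stopping test on the size of the gradient after each step; with a different stopping test the iteration counts will generally differ from those in the figure.

```python
def gradient(x, y):
    return (200 * (x - y * y) - 2 * (1 - x), -400 * y * (x - y * y))


def hessian(x, y):
    return 202.0, -400.0 * y, -400.0 * (x - y * y) + 800.0 * y * y


def newton(x, y, tol=1e-8, max_iter=50):
    """Return (x, y, iterations), or None if J is singular or no convergence."""
    for k in range(1, max_iter + 1):
        g1, g2 = gradient(x, y)
        j11, j12, j22 = hessian(x, y)
        det = j11 * j22 - j12 * j12
        if det == 0.0:
            return None
        x -= (j22 * g1 - j12 * g2) / det
        y -= (-j12 * g1 + j11 * g2) / det
        g1, g2 = gradient(x, y)
        if abs(g1) + abs(g2) < tol:
            return x, y, k
    return None


for y0 in (1.0, 0.9, 0.5, 0.0, -0.5, -1.0):
    print((-1.9, y0), newton(-1.9, y0))
```

The start (-1.9, 0) illustrates the warning in the text: the gradient vanishes at the point reached, but the second-derivative matrix there is indefinite, so the point is a saddle rather than a minimum.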

10.4.4 Exercises 
35 Follow the first two complete cycles in the 37 Use the steepest-ascent method to find the 


36 


steepest-descent algorithm for finding the 
minimum of the function 


T 1 lr 1 -l 
Р, %») =x lo +5% ie {5 


1 
starting at a — | : 


Show that the function 
f(x, y) = W(x + yy * (x — yy + 3x + 2у 


has a minimum at the point (-2, -3). 

Starting at the point (0, 0), use one iteration 
of the steepest-descent algorithm to determine 
an approximation to the minimum point. 

Show that one iteration of Newton’s method 
yields the minimum point from any starting 
point. 


38 


99 


maximum of the function 
Хх, у,2) = -(к-у+2) - Qx*z-2yY - ^ 1)? 


starting from (2, 2, 2). 


Minimize the function 
fé. yz) e (x - y zy * Qxez-2yY «c? - 1Y 


by Newton's method, starting at (2, 2, 2).


A new link road is to be constructed from a city centre 
to an existing road. In suitable coordinates and units 
the city centre is at (0, 0) and the existing road has 
equation y = 11 — 2x. The cost of construction is 
proportional to the length of road, but it is twice as 
expensive to construct the road in the urban region 
$|x| \le 1$ compared with outside the region. The link
road consists of two straight sections, inside and 
outside the urban region. Find the equations of the 
two sections that minimize the overall cost. 
10.4.5 Advanced multivariable searches

To overcome the problems of evaluating second derivatives, which are rarely available,
and of the unreliability of the Newton method, but to use its speed of convergence,
several methods were produced in the early 1960s. Two have survived and are now the
methods currently available in most program libraries. Conjugate-gradient methods
give one approach, but these will not be described here. We shall look at a method due
to Davidon, commonly called DFP (after Davidon, Fletcher and Powell, who developed
the method) or quasi-Newton methods. There is a whole class of such methods, but
only one will be studied.

The basic idea of the DFP method is to look for the minimum of the function
$f(x_1, \ldots, x_n)$ with gradient given by the column vector $G = [\partial f/\partial x_1 \ \ \partial f/\partial x_2 \ \ldots \ \partial f/\partial x_n]^T$
by iterating with a matrix $H_i$, which will be updated at each iteration, so that

\[ a_{i+1} = a_i - \lambda H_i G_i \qquad (10.16) \]

The reliable but slow steepest-descent method chooses $H_i = I$, the unit matrix, and $\lambda = \lambda_{\min}$,
while the less reliable but fast Newton method chooses $H_i = J_i^{-1}$ and $\lambda = 1$. Thus
the idea is to compute a sequence of $H_i$ so that $H_0 = I$ and $H_i \to J^{-1}$ as the minimum is
approached. The basic analysis required to implement this scheme is quite difficult and
beyond the scope of this book, so only the briefest of outlines will be given. For a
quadratic function

\[ f = c^T x + \tfrac{1}{2} x^T J x \]

at two successive points the gradient is given by

\[ G_i = c + Jx_i \]
\[ G_{i+1} = c + Jx_{i+1} \]

so, subtracting,

\[ G_{i+1} - G_i = J(x_{i+1} - x_i) \]

Writing $h_i = x_{i+1} - x_i$ and $y_i = G_{i+1} - G_i$ and working on the assumption that $HJ = I$
since we require $H_i \to J^{-1}$, we obtain

\[ H_i y_i = H_i J h_i = h_i \qquad (10.17) \]

It is found to be impossible to satisfy (10.17) unless an exact solution is known, so the
next best thing is to satisfy (10.17) one step behind as

\[ H_{i+1} y_i = h_i \qquad (10.18) \]

There is a whole class of matrices that satisfy the key equation (10.18) but only the
original Davidon matrix (and still one of the best) is quoted here, namely

\[ H_{i+1} = H_i - \frac{H_i y_i y_i^T H_i}{y_i^T H_i y_i} + \frac{h_i h_i^T}{h_i^T y_i} \qquad (10.19) \]

It can be shown that for a quadratic function this sequence of Hs produces $J^{-1}$ in
n iterations, where n is the dimension of the problem. The basic algorithm is described
in Figure 10.16 for the minimum of a general function $f(x_1, \ldots, x_n)$ with gradient
$G = [\partial f/\partial x_1 \ \ldots \ \partial f/\partial x_n]^T$.

read {initial values $x_0$, $H_0 = I$}
{calculate the gradient $G_0$ and $f_0$}
repeat
    {find $\min f(x_i - \lambda H_i G_i)$ by the cubic algorithm}
    {put $x_{i+1} = x_i - \lambda_{\min} H_i G_i$}
    {calculate $f_{i+1}$, $G_{i+1}$, and hence $h_i = x_{i+1} - x_i$
     and $y_i = G_{i+1} - G_i$}
    {update $H_i$ to $H_{i+1}$ by (10.19)}
until {sufficient accuracy}

Figure 10.16 DFP algorithm for the minimum of $f(x_1, x_2, \ldots, x_n)$.

This method was a major breakthrough in the early 1960s, and is still one of the best
and most reliable available. Proofs of convergence and computational experience are
available in advanced texts on optimization. To repeat a word of warning: these programs
are very long and tedious to write because of the large amount of checking and
remedial work that has to be inserted to prevent the program stopping. Such programs
are available in software libraries, and these should be used.

Example 10.18

Use the DFP method to find the minimum of

\[ f(x, y) = x^4 + y^4 + (2x + y - 5)^2 \]

starting at (0, 0).

Solution

Note that only the first derivatives

\[ G = \begin{bmatrix} 4x^3 + 4(2x + y - 5) \\ 4y^3 + 2(2x + y - 5) \end{bmatrix} \]

are required in this method.

Iteration 1

\[ a_0 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad G_0 = \begin{bmatrix} -20 \\ -10 \end{bmatrix}, \quad H_0 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad f_0 = 25 \]

and we search in the direction

\[ a = \begin{bmatrix} 0 \\ 0 \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} -20 \\ -10 \end{bmatrix} = \begin{bmatrix} 20\lambda \\ 10\lambda \end{bmatrix} \]

for a minimum of f; that is, we compute

\[ \min_\lambda \{ (20\lambda)^4 + (10\lambda)^4 + (50\lambda - 5)^2 \} \]

The cubic algorithm gives $\lambda_{\min} = 0.064\,14$, so

\[ a_1 = \begin{bmatrix} 1.2828 \\ 0.6414 \end{bmatrix}, \quad G_1 = \begin{bmatrix} 1.2706 \\ -2.5308 \end{bmatrix}, \quad h_0 = \begin{bmatrix} 1.2828 \\ 0.6414 \end{bmatrix}, \quad y_0 = \begin{bmatrix} 21.2706 \\ 7.4692 \end{bmatrix} \]

Thus H is calculated from (10.19) as

\[ H_1 = \begin{bmatrix} 0.1611 & -0.2870 \\ -0.2870 & 0.9031 \end{bmatrix} \]

Iteration 2

\[ a_1 = \begin{bmatrix} 1.2828 \\ 0.6414 \end{bmatrix}, \quad G_1 = \begin{bmatrix} 1.2706 \\ -2.5308 \end{bmatrix}, \quad H_1 = \begin{bmatrix} 0.1611 & -0.2870 \\ -0.2870 & 0.9031 \end{bmatrix}, \quad f_1 = 6.092 \]

We now search in the direction

\[ a = \begin{bmatrix} 1.2828 \\ 0.6414 \end{bmatrix} - \lambda \begin{bmatrix} 0.1611 & -0.2870 \\ -0.2870 & 0.9031 \end{bmatrix} \begin{bmatrix} 1.2706 \\ -2.5308 \end{bmatrix} = \begin{bmatrix} 1.2828 - 0.9309\lambda \\ 0.6414 + 2.6501\lambda \end{bmatrix} \]

and the cubic algorithm gives $\lambda_{\min} = 0.1123$. We can now compute the next point and its
gradient as

\[ a_2 = \begin{bmatrix} 1.1782 \\ 0.9390 \end{bmatrix}, \quad G_2 = \begin{bmatrix} -0.2761 \\ -0.0969 \end{bmatrix} \]

The new Davidon matrix is calculated from (10.19) as

\[ H_2 = \begin{bmatrix} 0.0597 & -0.0050 \\ -0.0050 & 0.1191 \end{bmatrix} \]

Iteration 3 The next iteration gives


j —0.021 
g= LIETI «e. 00102) pa OGL =O аад 
0.9452 0.0192 -0.0216 0.1019 
The iterations continue until convergence at x = 1.1886 and y = 0.9434. 
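
The DFP loop of Figure 10.16 applied to Example 10.18 can be sketched in Python as below. This is a hedged illustration rather than a library-quality routine: the line minimization uses a golden-section search on an assumed bracket $0 \le \lambda \le 2$ instead of the cubic algorithm, the function names are hypothetical, and the only safeguard is to stop when either denominator in (10.19) is not positive.

```python
import math

ALPHA = (3 - math.sqrt(5)) / 2


def f(x, y):
    return x ** 4 + y ** 4 + (2 * x + y - 5) ** 2


def grad(x, y):
    r = 2 * x + y - 5
    return [4 * x ** 3 + 4 * r, 4 * y ** 3 + 2 * r]


def line_min(g, lo=0.0, hi=2.0, tol=1e-10):
    """Golden-section search for the minimum of g on [lo, hi]."""
    while hi - lo > tol:
        t2 = lo + ALPHA * (hi - lo)
        t3 = hi - ALPHA * (hi - lo)
        if g(t2) < g(t3):
            hi = t3
        else:
            lo = t2
    return 0.5 * (lo + hi)


def dfp(x, max_iter=30, tol=1e-6):
    H = [[1.0, 0.0], [0.0, 1.0]]                    # H_0 = I
    G = grad(*x)
    for _ in range(max_iter):
        if math.hypot(*G) < tol:
            break
        d = [-(H[0][0] * G[0] + H[0][1] * G[1]),
             -(H[1][0] * G[0] + H[1][1] * G[1])]    # search direction -H G
        lam = line_min(lambda t: f(x[0] + t * d[0], x[1] + t * d[1]))
        h = [lam * d[0], lam * d[1]]
        x = [x[0] + h[0], x[1] + h[1]]
        G_new = grad(*x)
        y = [G_new[0] - G[0], G_new[1] - G[1]]
        Hy = [H[0][0] * y[0] + H[0][1] * y[1],
              H[1][0] * y[0] + H[1][1] * y[1]]
        yHy = y[0] * Hy[0] + y[1] * Hy[1]
        hy = h[0] * y[0] + h[1] * y[1]
        if yHy <= 0.0 or hy <= 0.0:
            break                                   # update (10.19) undefined
        for i in range(2):
            for j in range(2):
                H[i][j] += h[i] * h[j] / hy - Hy[i] * Hy[j] / yHy
        G = G_new
    return x, H


print(dfp([0.0, 0.0], max_iter=1))   # first point and first updated H
print(dfp([0.0, 0.0]))               # converged point and final H
```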


In current practice the steepest descent method is rarely used because it is too slow.
Newton's method requires a matrix of second derivatives, which are not usually available,
and, although fast, it is too unreliable as illustrated in the previous section, particularly
in higher dimensional problems. The quasi-Newton methods, however, are widely available
in most libraries and packages. They are reliable, very competitive and compare
favourably with other well used methods, such as conjugate gradients. In the MATLAB
optimization routine fminunc, DFP and steepest descent are available, but a variant,
BFGS (Broyden, Fletcher, Goldfarb, Shanno; see Exercise 41), is the default method.
In the Optimization Toolbox of MATLAB the 'unconstrained nonlinear' option contains
a demo of the three methods on the Rosenbrock function (it was also used to illustrate
Newton's method in the previous section)

\[ f(x, y) = 100(x - y^2)^2 + (1 - x)^2 \]
There are some first class graphics showing the progress of the method, and the
convergence found from the starting point (-1.9, 1) is

BFGS               34 iterations    50 function evaluations
DFP                40 iterations    64 function evaluations
Steepest descent   exceeds limit    250 function evaluations


This performance is typical of the methods. The complication of the methods and
the intimate relation with computers illustrate the need for packages to perform the
extensive arithmetic. For the milk carton problem, Example 10.2, the function information
is put in the M-file milk.m

function [f,g]=milk(x)
h=x(1); b=x(2);   % dimensions in mm, so f is in mm^2 (82 700 mm^2 = 827 cm^2)
f=(10+h+b)*(5+2*b+2272000/(h*b));
if nargout > 1
    g(1)=5+2*b+2272000/(h*b)-(10+h+b)*2272000/(h^2*b);
    g(2)=5+2*b+2272000/(h*b)+(10+h+b)*(2-2272000/(h*b^2));
end


The instructions


x0=[100;100];
options=optimset('GradObj','on','Display','iter');
[x,fval]=fminunc(@milk,x0,options)


produce the minimum of 827 cm$^2$ with $h = 138.6$ mm and $b = 65.7$ mm in six iterations.


A major development since the 1970s has been in devising modifications that avoid 
the line searches. It was found that the latter were very time-consuming, so there was 
great pressure to avoid them. The searches were replaced by one or more steps in the 
search direction until the function has been reduced ‘sufficiently’ and then the matrix 
H is updated. This not only reduces the number of function evaluations, which is nor- 
mally the most expensive part of the routine, but it is found to reduce the number of 
iterations required. The BFGS variant is found to be very suitable for this approach and 
it is used in the MATLAB implementation. What is meant by ‘sufficiently’ requires 
careful consideration and a discussion can be found in R. Fletcher, Practical Methods 
of Optimization, Volume 1, Unconstrained Optimization, Wiley, 1980. 

The early development of numerical optimization algorithms led to very distinct methods 
for cases when derivatives were or were not available. As computers developed in 
speed, this difference became less necessary, since derivatives could be calculated very 
rapidly by a numerical method. Options are now built into packages and library routines 
which use derivatives when supplied, or use a numerical approximation if the deriva- 
tives are not supplied. 

Adapting methods to deal with constraints is of intense interest in practice, and has 
been a major thrust in the subject. For fully nonlinear problems with nonlinear constraints
the practical difficulties are very severe, but robust programs are now available
in most libraries and packages.
10.4.6 Least squares

When minimization problems are in the form of least squares

\[ F = f_1^2(x_1, x_2, \ldots, x_n) + f_2^2(x_1, x_2, \ldots, x_n) + \ldots + f_m^2(x_1, x_2, \ldots, x_n) \]

then there is a slightly different and potentially more efficient technique that exploits
the specific structure of the problem. In matrix form F becomes

\[ F = [f_1\ f_2\ \ldots\ f_m] \begin{bmatrix} f_1 \\ f_2 \\ \vdots \\ f_m \end{bmatrix} = f^T f \]

Each of the functions is expanded by Taylor's theorem, in matrix form, to first order
about $x = a$, for example putting $x_1 = a_1 + h_1$, $x_2 = a_2 + h_2$, ... for $f_1$:

\[ f_1(x_1, x_2, \ldots, x_n) = f_1(a_1 + h_1, \ldots, a_n + h_n) \approx f_1(a_1, \ldots, a_n) + \left[\frac{\partial f_1}{\partial x_1}\ \frac{\partial f_1}{\partial x_2}\ \ldots\ \frac{\partial f_1}{\partial x_n}\right] \begin{bmatrix} h_1 \\ h_2 \\ \vdots \\ h_n \end{bmatrix} \]

The partial derivatives are evaluated at $x = a$. Repeating for all the m functions gives in
matrix form

\[ \begin{bmatrix} f_1 \\ \vdots \\ f_m \end{bmatrix} = \begin{bmatrix} f_1 \\ \vdots \\ f_m \end{bmatrix}_{x=a} + \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & & & \vdots \\ \dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_n} \end{bmatrix} \begin{bmatrix} h_1 \\ \vdots \\ h_n \end{bmatrix} \]

or in a more compact notation

\[ f(x) = f(a) + Jh \qquad (10.20) \]

The minimum of F requires

\[ 0 = \frac{\partial F}{\partial x_1} = 2f_1\frac{\partial f_1}{\partial x_1} + \ldots + 2f_m\frac{\partial f_m}{\partial x_1} = 2\left[\frac{\partial f_1}{\partial x_1}\ \ldots\ \frac{\partial f_m}{\partial x_1}\right] \begin{bmatrix} f_1 \\ \vdots \\ f_m \end{bmatrix} \]

Repeating this differentiation for each of the variables and putting the equations into
matrix form $0 = J^T f(x)$ and using (10.20) gives

\[ 0 = J^T f(x) = J^T(f(a) + Jh) \]

Hence we can now compute h as

\[ h = -(J^T J)^{-1} J^T f(a) \]

provided, of course, that the inverse exists. This is the same result that was quoted in
Section 1.8.3 on singular value decomposition. To compute the minimum the result is
iterated as

\[ a_{i+1} = a_i - (J^T J)^{-1} J^T f \qquad (10.21) \]

Example 10.19

Find the minimum of the function, starting at (0, 0),

\[ F = (x + y - 1)^2 + (x - y + 1)^2 + (2x - y)^2 \]

Solution

\[ f = \begin{bmatrix} x + y - 1 \\ x - y + 1 \\ 2x - y \end{bmatrix}, \quad J = \begin{bmatrix} 1 & 1 \\ 1 & -1 \\ 2 & -1 \end{bmatrix} \]

so at the start point

\[ f = \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \quad J^T J = \begin{bmatrix} 6 & -2 \\ -2 & 3 \end{bmatrix}, \quad J^T f = \begin{bmatrix} 0 \\ -2 \end{bmatrix} \]

From (10.21) the new value of a is (2/7, 6/7) with F = 2/7. Because all the functions are
linear the minimum is obtained in one iteration.

MATLAB has a built-in procedure to solve this type of problem; the following
instructions obtain the same result quickly

c=[1 1;1 -1;2 -1]; d=[1;-1;0];
x=lsqlin(c,d,[ ],[ ]) % Can use linear equality and
% inequality constraints, [ ] indicates empty

Example 10.20

The experimental data

X    1.2    2      5      10
Y    0.8    1.15   1.55   1.89

is thought to fit the function

\[ Y = \frac{aX}{1 + bX} \]

Estimate the values of a and b and compare.

Solution

The minimization of the least squares function

\[ F(a, b) = \sum_{k=1}^{4} \left( Y_k - \frac{aX_k}{1 + bX_k} \right)^2 \]

should give good estimates of a and b. Note that the origin automatically satisfies the
function. Take

\[ f = \begin{bmatrix} Y_1 - \dfrac{aX_1}{1 + bX_1} \\ \vdots \\ Y_4 - \dfrac{aX_4}{1 + bX_4} \end{bmatrix} \quad \text{and} \quad J = \begin{bmatrix} \dfrac{-X_1}{1 + bX_1} & \dfrac{aX_1^2}{(1 + bX_1)^2} \\ \vdots & \vdots \\ \dfrac{-X_4}{1 + bX_4} & \dfrac{aX_4^2}{(1 + bX_4)^2} \end{bmatrix} \]

Starting at a=1 and b=0.5 the iteration (10.21) was written in MATLAB and quickly
gives a converged value of a=1.0779 and b=0.4764. However, starting from (1, 1)
the method quickly diverges. It is important that a good starting guess is known. The
code

ezplot('1.0778*x/(1+0.4764*x)',[0,10]), hold on
plot(0,0,'*',1.2,0.8,'*',2,1.15,'*',5,1.55,'*',10,1.89,'*')

produces Figure 10.17, which gives an illustration of how good the fit is. Note that least
squares often gives a better fit to experimental data than interpolation methods, such as
splines, since rogue points are not dominant in the method.

Figure 10.17 Fit of the function in Example 10.20 (plot of 1.0778x/(1 + 0.4764x) with the data points).







40 


41 


42 
MATLAB has built-in procedures to deal with exactly this type of curve fitting. The
following instructions obtain the same result.

xd=[1.2 2 5 10]; yd=[0.8 1.15 1.55 1.89]; % the data
gg=@(b,xd)b(1)*xd./(1+b(2)*xd); % sets up the test function
x=lsqcurvefit(gg,[1 1],xd,yd) % returns the values

x = 1.0779 0.4764

A much wider range of start values can be used since lsqcurvefit can access a
variety of methods of solution and contains numerous safeguards.


10.4.7 Exercises 


Minimize the following functions by the DFP 
method, completing two cycles: 


(a) f(x, y) 2 (x — yy *- 4(x — D, starting at (2, 2); 


(6) Д, у, 2) = (х у + 2)? + (2х +2 2) + гї, 
starting at (0, 0, 0). 


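Show that the updating formulas

(i) \[ H_{i+1} = H_i + \frac{(h_i - H_i y_i)(h_i - H_i y_i)^T}{(h_i - H_i y_i)^T y_i} \quad \text{(rank 1)} \]

and

(ii) \[ H_{i+1} = H_i + \left(1 + \frac{y_i^T H_i y_i}{h_i^T y_i}\right)\frac{h_i h_i^T}{h_i^T y_i} - \frac{h_i y_i^T H_i + H_i y_i h_i^T}{h_i^T y_i} \quad \text{(BFGS)} \]

satisfy (10.18). Follow the DFP method through for
two cycles, but using these updates for $H_i$, on the
functions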


(a) x? + 2у2, starting at (1, 2); 
(b) x? + (х= у + 1)? + у22>, starting at 
(0.5, 0.5, 0.5). 





Show that the update formula (with suffixes
suppressed)

\[ H' = H + vp^T - Huq^T \]

satisfies the basic quasi-Newton equation (10.18)

\[ H'u = v \]

where p and q are vectors satisfying

\[ p^T u = 1, \quad q^T u = 1 \]

but are otherwise arbitrary.
By making a suitable choice of $\alpha$, $\beta$ and $\alpha'$, $\beta'$ in
the expressions

\[ p = \alpha v + \beta Hu \]
\[ q = \alpha' v + \beta' Hu \]

show that the Davidon formula (10.19) and formula
(i) in Exercise 41 can be obtained.


An alternative algorithm for finding the minimum 
of a function, f(x) with gradient g, of several 
variables is due to Fletcher and Reeves. Starting 

at the point $x_0$, the first search direction is chosen as
$p_0 = -g_0$. Successive search directions are given by

\[ p_i = -g_i + \frac{g_i^T g_i}{g_{i-1}^T g_{i-1}}\,p_{i-1} \]

and successive points satisfy

\[ x_{i+1} = x_i + \lambda_i p_i \]

where $\lambda_i$ is chosen to minimize the function $f(x)$
in this search direction.

Apply the method to the functions 
(a) f= 5 (3х? + у?) апа 


(b) fe (x- y 1? «xy? - (z- 1Y 
10.5 Engineering application: chemical processing plant

A chemical processing plant consists of a main processing unit and two recovery units.
Chemicals A and B are fed into the plant, and produce a maximum output of 100 t day$^{-1}$
of material C. The effluent stream is rich in chemical B, which can be recovered from the
primary and secondary recovery units. The total recovered, when at full throughput, is
10 t day$^{-1}$ of pure B, and it is fed back into the incoming stream of chemical B. The process
is illustrated in Figure 10.18; the numbers in parentheses indicate the maximum flow (in
t day$^{-1}$) that can be sent down the pipes, and the $x_i$ indicate the actual flow (in t day$^{-1}$).

The chemistry of the process implies that the chemicals must be mixed in given
ratios. For the present system it is found that

\[ x_1 : x_3 = 1 : 1, \quad x_1 : x_4 = 3 : 5, \quad x_1 : x_5 = 3 : 1, \quad x_5 : x_7 = 5 : 3 \]
\[ x_5 : x_6 = 5 : 2, \quad x_7 : x_8 = 6 : 1, \quad x_7 : x_{10} = 6 : 5 \]

must be maintained for any flow through the system. Chemical A costs £100 t$^{-1}$, chemical B
costs £120 t$^{-1}$ and chemical C sells for £220 t$^{-1}$. The running costs are as follows:

                      Variable costs              Fixed costs
Process unit          £70 t$^{-1}$ of product     £500 day$^{-1}$
Primary recovery      £30 t$^{-1}$ of input       £200 day$^{-1}$
Secondary recovery    £40 t$^{-1}$ of input       £100 day$^{-1}$
Disposal of waste     £30 t$^{-1}$
Indirect cost                                     £400 day$^{-1}$

It is required to find the most profitable operating policy that can be achieved.
The profit can be written down for a day's production as

z = -(fixed costs) + (profit from sale of C) - (costs of chemicals A and B)
    - (process unit costs) - (primary unit costs) - (secondary unit costs)
    - (waste product costs)

\[ z = -1200 + (220x_4) - (100x_1 + 120x_2) - (70x_4) - (30x_5) - (40x_7) - 30x_{10} \]

\[ z = -1200 - 100x_1 - 120x_2 + 150x_4 - 30x_5 - 40x_7 - 30x_{10} \]


Figure 10.18 Schematic diagram of a chemical processing plant. The numbers in
parentheses give the maximum flow.

Figure 10.19 Modification to processing plant.
The constraints on the flow, given by the maximum throughput, are

\[ x_1 \le 60, \ x_2 \le 50, \ x_3 \le 60, \ x_4 \le 100, \ x_5 \le 20, \ x_6 \le 8, \ x_7 \le 12, \ x_8 \le 2, \ x_9 \le 10, \ x_{10} \le 10 \]

The constraints on the chemistry given by the fixed ratios can be written in a convenient
form as

\[ x_1 - x_3 = 0, \ 5x_1 - 3x_4 = 0, \ x_1 - 3x_5 = 0, \ 3x_5 - 5x_7 = 0, \ 2x_5 - 5x_6 = 0, \ x_7 - 6x_8 = 0, \ 5x_7 - 6x_{10} = 0 \]

Finally, at the junctions J and K, continuity (what flows in equals what flows out) gives

\[ x_2 + x_9 - x_3 = 0, \quad x_6 + x_8 - x_9 = 0 \]

The problem thus has 10 variables, 10 inequality constraints and 9 equality constraints.
The choice is whether to use the equality constraints to eliminate some of the variables
or just to treat the 19 constraints directly by LP. The equations are sufficiently simple
to solve for the variables as

\[ x_2 = \tfrac{5}{6}x_1, \ x_3 = x_1, \ x_4 = \tfrac{5}{3}x_1, \ x_5 = \tfrac{1}{3}x_1, \ x_6 = \tfrac{2}{15}x_1 \]
\[ x_7 = \tfrac{1}{5}x_1, \ x_8 = \tfrac{1}{30}x_1, \ x_9 = \tfrac{1}{6}x_1, \ x_{10} = \tfrac{1}{6}x_1, \ z = 27x_1 - 1200 \]

Thus $x_1$ must be as large as possible, that is, at the value 60, giving a maximum profit
of £420 day$^{-1}$. We must check that all the constraints are satisfied, and indeed this is the
case. It is easily seen that each variable reaches its maximum possible value indicated
in Figure 10.18.
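
The elimination above is easy to check with exact arithmetic. The Python sketch below assumes the ratio and continuity equations in the form just listed; the function names `flows` and `profit` are hypothetical. It rebuilds every flow from $x_1$, checks the upper bounds and evaluates the profit.

```python
from fractions import Fraction as Fr

UPPER = {1: 60, 2: 50, 3: 60, 4: 100, 5: 20, 6: 8, 7: 12, 8: 2, 9: 10, 10: 10}


def flows(x1):
    """Flows x_1..x_10 implied by the fixed ratios and the two junctions."""
    x1 = Fr(x1)
    x = {1: x1, 3: x1, 4: Fr(5, 3) * x1, 5: x1 / 3}
    x[7] = Fr(3, 5) * x[5]
    x[6] = Fr(2, 5) * x[5]
    x[8] = x[7] / 6
    x[10] = Fr(5, 6) * x[7]
    x[9] = x[6] + x[8]          # continuity: x6 + x8 - x9 = 0
    x[2] = x[3] - x[9]          # continuity: x2 + x9 - x3 = 0
    return x


def profit(x):
    return (-1200 - 100 * x[1] - 120 * x[2] + 150 * x[4]
            - 30 * x[5] - 40 * x[7] - 30 * x[10])


x = flows(60)
print(all(0 <= x[i] <= UPPER[i] for i in UPPER))   # every bound respected?
print(profit(x))                                   # daily profit in pounds
print(profit(flows(1)) - profit(flows(0)))         # coefficient of x_1 in z
```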

When we look at variations on the problem, it becomes less clear whether to elimin- 
ate or just to use LP directly on the modified equations. For instance a very sensible 
question is whether it is worth using the primary or secondary recovery units. We can 
consider this question by allowing a portion to go to waste (at the same cost given 
previously), as indicated in Figure 10.19. 

We add to the previous continuity equations similar equations for the junctions L
and H:

\[ x_5 - x_{11} - x_{12} = 0 \]
\[ x_7 - x_{13} - x_{14} = 0 \]

The inputs to the primary and secondary units are now $x_{12}$ and $x_{14}$, so we need to modify
the fixed ratio chemical constraints as follows:

replace $3x_5 - 5x_7 = 0$ by $3x_{12} - 5x_7 = 0$
replace $2x_5 - 5x_6 = 0$ by $2x_{12} - 5x_6 = 0$
replace $x_7 - 6x_8 = 0$ by $x_{14} - 6x_8 = 0$
replace $5x_7 - 6x_{10} = 0$ by $5x_{14} - 6x_{10} = 0$

In the cost function $x_5$ is replaced by $x_{12}$, and $x_7$ by $x_{14}$, and the additional waste costs
$(-30x_{11} - 30x_{13})$ must be added:

\[ z = -1200 + 150x_4 - 100x_1 - 120x_2 - 30x_{11} - 30x_{12} - 40x_{14} - 30x_{13} - 30x_{10} \]





We now have three free variables, so we shall certainly need an LP approach. An LP
package was used to obtain the solution

\[ x_1 = 57.69, \ x_2 = 50, \ x_3 = 57.69, \ x_4 = 96.15 \]
\[ x_5 = 19.23, \ x_6 = 7.69, \ x_7 = 11.54, \ x_8 = 0 \]
\[ x_9 = 7.69, \ x_{10} = 0, \ x_{11} = 0, \ x_{12} = 19.23 \]
\[ x_{13} = 11.54, \ x_{14} = 0 \]

and the profit is £530.77 day$^{-1}$. It can be seen that $x_{11} = 0$, so nothing is sent to waste
before the primary recovery unit; but $x_{14} = 0$, so that all the material from the primary
to the secondary goes to waste, and the secondary unit is bypassed. The effect of this
strategy is to increase the profit by about 26%.
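
The quoted LP solution can be checked, though not derived, with a few lines of Python. The sketch below does not solve the LP: it assumes the policy found by the package ($x_{11} = 0$ and $x_{14} = 0$), rebuilds the remaining flows from the modified constraints, and pushes $x_1$ up until the bound $x_2 \le 50$ becomes active. The function names are hypothetical.

```python
from fractions import Fraction as Fr


def bypass_policy(x1):
    """Flows when x_11 = 0 and x_14 = 0 (secondary recovery bypassed)."""
    x1 = Fr(x1)
    x = {1: x1, 3: x1, 4: Fr(5, 3) * x1, 5: x1 / 3, 11: Fr(0), 14: Fr(0)}
    x[12] = x[5] - x[11]            # junction: x5 - x11 - x12 = 0
    x[7] = Fr(3, 5) * x[12]         # 3 x12 - 5 x7 = 0
    x[6] = Fr(2, 5) * x[12]         # 2 x12 - 5 x6 = 0
    x[13] = x[7] - x[14]            # junction: x7 - x13 - x14 = 0
    x[8] = x[14] / 6                # x14 - 6 x8 = 0
    x[10] = Fr(5, 6) * x[14]        # 5 x14 - 6 x10 = 0
    x[9] = x[6] + x[8]
    x[2] = x[3] - x[9]
    return x


def profit(x):
    return (-1200 + 150 * x[4] - 100 * x[1] - 120 * x[2] - 30 * x[12]
            - 40 * x[14] - 30 * x[10] - 30 * x[11] - 30 * x[13])


# x2 = (13/15) x1 under this policy, so x2 = 50 gives x1 = 750/13
x = bypass_policy(Fr(750, 13))
print([round(float(x[i]), 2) for i in sorted(x)])
print(round(float(profit(x)), 2))
```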

There are many other variations that can be considered for this model. For instance, 
pumps often go wrong, so it is important to investigate what happens if the maximum 
flows are reduced or even cut completely. Once the basic program has been set up, such 
variations are quite straightforward to implement. 


10.6 Engineering application: heating fin 


A heating fin is of the shape indicated in Figure 10.20, where the wall temperature is
$T_1$, the ambient temperature is $T_0$ and within the fin the value $T = T(x)$ is assumed to
depend only on x and to be independent of y and z. Heat is transferred by conduction
along the fin, which has thermal conductivity k, and heat is transferred to the outside
according to Newton's law of cooling, with surface heat-transfer coefficient h. Considering
an area of the fin of unit width in the z direction and height 2y, the heat transferred
by conduction in the x direction is $k\,2y\,dT/dx$. The net transfer through the element
illustrated is $(d/dx)(k\,2y\,dT/dx)$. The total heat lost through a surface element of unit width
in the z direction and length $\Delta s = [1 + (dy/dx)^2]^{1/2}\Delta x$ along the surface is $h(T - T_0)\Delta s$. Since
there are two surfaces, we can write the heat-transfer equation as

\[ \frac{d}{dx}\left(k\,2y\frac{dT}{dx}\right) = 2h(T - T_0)\left[1 + \left(\frac{dy}{dx}\right)^2\right]^{1/2} \]

Provided that $dy/dx$ is not too large, $(dy/dx)^2$ can be neglected, giving

\[ k\frac{d}{dx}\left(y\frac{dT}{dx}\right) = h(T - T_0) \qquad (10.22) \]
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Figure 10.20 
Heating fin. 





Ambient temperature To 


The mass of the fin is given, so that its cross-sectional area is known, and hence 
| уйх= 14 (10.23) 
Finally, we wish to maximize the heat transfer, so we require the maximum of 


I- Z (T-T) dx (10.24) 


0 


over all possible functions y(x). 

The problem involves choosing a function y(x) that satisfies (10.23) and then solving 
(10.22) for $T - T_0$. This is then substituted into (10.24) and the integral evaluated. 
Out of all possible such functions y, we choose the one that maximizes I. The scheme 
outlined is extremely difficult and belongs to a class called variational problems. 
An alternative approximate method must be sought. The assumption made is that the 
temperature falls linearly with x as 

\[ T - T_0 = (T_1 - T_0)(1 - \alpha x) \]

We also assume that y = 0 at x = a. We thus have two free parameters, $\alpha$ and a, which 
we can use to give an approximate solution. Given $T - T_0$, y can be computed from 
(10.22) as 

\[ y = \frac{h}{k\alpha}\left(a - \tfrac{1}{2}\alpha a^2 - x + \tfrac{1}{2}\alpha x^2\right) \]


It can be seen that our basic assumption implies that y is quadratic in x. To satisfy the 
area constraint (10.23), a simple integral is performed to give 

\[ \tfrac{1}{2}A = \frac{h}{k\alpha}\,a^2\left(\tfrac{1}{2} - \tfrac{1}{3}\alpha a\right) \]

which gives a relation between $\alpha$ and a. The function I is now integrated as 

\[ I = 2h(T_1 - T_0)\int_0^a (1 - \alpha x)\,dx \]

or 

\[ \frac{I}{2h(T_1 - T_0)} = a - \tfrac{1}{2}\alpha a^2 \]


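Thus the very difficult problem has been reduced to a Lagrange multiplier problem of 
maximizing 

\[ f = a - \tfrac{1}{2}\alpha a^2 \]

subject to 

\[ g = 0 = \frac{a^2}{2\alpha} - \tfrac{1}{3}a^3 - \delta^2 \qquad (10.25) \]

where $\delta^2 = kA/2h$. The Lagrange multiplier analysis gives the equations 

\[ \frac{\partial (f - \lambda g)}{\partial a} = 1 - \alpha a - \lambda\left(\frac{a}{\alpha} - a^2\right) = 0 \]

\[ \frac{\partial (f - \lambda g)}{\partial \alpha} = -\tfrac{1}{2}a^2 + \frac{\lambda a^2}{2\alpha^2} = 0 \]

Hence 

\[ \lambda = \alpha^2, \qquad 1 - 2\alpha a + \alpha^2 a^2 = 0 \]

so that $a\alpha = 1$. Substituting back into the constraint (10.25) gives 

\[ \delta^2 = \tfrac{1}{6}a^3, \quad \text{or} \quad a = (6\delta^2)^{1/3} \]

and therefore 

\[ T - T_0 = (T_1 - T_0)\left(1 - \frac{x}{a}\right) \]

so that 

\[ \frac{y}{a} = \frac{ha}{2k}\left(1 - \frac{x}{a}\right)^2 \]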
Thus, given the physical parameters k, h and A, the ‘best’ shape can be derived. 
This model shows how a very difficult mathematical problem in optimization can be 
reduced to a much more straightforward one by an appropriate choice of test functions 
for $T - T_0$. 
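The closed-form result can be checked numerically. The following is a minimal sketch, assuming the constraint (10.25) is written as g = a²/(2α) − a³/3 − δ² and using purely illustrative values for k, h and A (they are not taken from the text). It evaluates a = (6δ²)^(1/3) and α = 1/a, confirms that the constraint is satisfied, and confirms that the resulting profile y(x) encloses the required half-area A/2.

```python
def optimal_fin(k, h, A):
    """Return (a, alpha, delta2) using a = (6*delta^2)**(1/3) and alpha = 1/a."""
    delta2 = k * A / (2.0 * h)            # delta^2 = kA/2h
    a = (6.0 * delta2) ** (1.0 / 3.0)     # fin length from the constraint
    return a, 1.0 / a, delta2


def half_thickness(x, a, k, h):
    """Fin profile y(x) = (h a^2 / 2k) (1 - x/a)^2."""
    return (h * a ** 2 / (2.0 * k)) * (1.0 - x / a) ** 2


def constraint_residual(a, alpha, delta2):
    """g(a, alpha) in the assumed form of (10.25); zero when satisfied."""
    return a ** 2 / (2.0 * alpha) - a ** 3 / 3.0 - delta2


def area_by_midpoint(a, k, h, n=10000):
    """Midpoint-rule estimate of the integral of y from 0 to a."""
    dx = a / n
    return sum(half_thickness((i + 0.5) * dx, a, k, h) for i in range(n)) * dx


# Illustrative (hypothetical) parameter values in SI units.
k, h, A = 200.0, 50.0, 1.0e-4
a, alpha, delta2 = optimal_fin(k, h, A)
half_area = area_by_midpoint(a, k, h)

print("a =", a, " alpha =", alpha)
print("constraint residual =", constraint_residual(a, alpha, delta2))
print("integral of y =", half_area, " target A/2 =", A / 2.0)
```

The check only confirms consistency of the stationary point with the constraints; it does not by itself establish that the point is a maximum.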
10.7 Review exercises (1-26) 


1 Use the simplex method to find the maximum of 
the function 

$F = 12x_1 + 8x_2$ 

subject to the constraints 

$x_1 + x_2 \le 350$ 
$2x_1 + x_2 \le 600$ 
$x_1 + 3x_2 \le 900$ 
$x_1, x_2 \ge 0$ 

Check your results with a graphical solution. 


2 A manufacturer makes three types of sailboard 
and is trying to decide how many of each to 
make in a given week. There are 400 h of labour 
available, and the three types of sailboard require 
respectively 10, 20 and 30 h of labour to construct. 
A shortage of fibreglass and of resin coating is 
anticipated. The quantities required by each type 
of sailboard are as follows: 

                          Type 1   Type 2   Type 3   Total supplies 
Fibreglass (kg)              5       10       25          290 
Resin coating (litres)       3        2        1           72 

If the profits on types 1, 2 and 3 are £10, £15 and 
£25 respectively, how many of each type should the 
manufacturer make to maximize profit? 


3 A motor manufacturer makes a ‘standard’, a ‘super’ 
and a ‘deluxe’ version of a particular model of car. 
It is found that each week two of the materials 
are in limited supply: that of chromium trim 
being limited to 1600 m per week and that of 
soundproofing material to 1500 m² per week. The 
quantity of each of these materials required by a car 
of each type is as follows: 

                        Standard   Super   Deluxe 
Chromium trim (m)          10        20      30 
Soundproofing (m²)         10        15      20 


All other materials are in unlimited supply. 


The manufacturer knows that any number of 
standard models can be sold, but it is estimated 
that the combined market for the super and 
deluxe versions is limited to 50 models per week. 
In addition there is a contractual obligation to 
supply a total of 70 cars (of any type) each week. 

The profits on a standard, super and deluxe 
model are £100, £300 and £400 respectively. 
Assuming that the facilities to manufacture any 
number of cars are available, how many of each 
model should be made to maximize the weekly 
profit, and what is that profit? 


4 A poor student lives on bread and cheese. 

The bread contains 1000 calories and 25 g 
protein in each kilogram, and the cheese has 
2000 calories and 100 g of protein per kg. To 
maintain a good diet, the student requires at 
least 3000 calories and 100 g of protein per day. 
Bread costs 60p per kg and cheese 180p per kg. 
Find the minimum cost of bread and cheese 
needed per day to maintain the diet. 


5 Use Lagrange multipliers to find the maximum 
and minimum distances from the origin to the 
point P lying on the curve 

$x^2 - xy + y^2 = 1$ 


6 A solid body of volume V and surface area S is 
formed by joining together two cubes of different 
sizes so that every point on one side of the 
smaller cube is in contact with the larger cube. 
If S = 7 m², find the maximum and/or minimum 
values of V for which both cubes have non-zero 
volumes. 


7 Find the maximum distance from the point 
(1, 0, 0) to the surface represented by 


Wwt+y?+z=8 


8 Find the local extrema of the function 

$F(x, y, z) = x + 2y + 3z$ 

subject to the constraint 

$x^2 + y^2 + z^2 = 14$ 

Obtain also the global maximum and minimum 
values of F in the region 

$x \ge 0,\ y \ge 0,\ z \ge 0,\ x^2 + y^2 + z^2 \le 14$ 
9 A triangle with sides a, b, c has given perimeter 
2s. Recall that the area of the triangle, A, is given 
by the formula 

$A^2 = s(s - a)(s - b)(s - c)$ 

(i) If a is given, use Lagrange multipliers to find 
the values of b and c that make the area a 
maximum. 
(ii) If a, b, c are unrestricted use Lagrange 
multipliers to find the values that make 
the area a maximum. 

10 A nuclear reactor is in the form of a circular 
cylinder of radius r and height h. According to 
the theory of nuclear diffusion, the restriction 

\[ \left(\frac{a}{r}\right)^2 + \left(\frac{\pi}{h}\right)^2 = b \]

applies, where a and b are constants. Use Lagrange 
multipliers to find the values of r and h that make 
the volume of the reactor a maximum. 


11 According to lubrication theory, the lift on a pad 
bearing, where fluid flows in the narrow gap 
between a pad and a fixed piece of machinery, 
is given by 

\[ F = \frac{A}{(k - 1)^2}\left(\ln k - \frac{2(k - 1)}{k + 1}\right) \]

where A is a constant, $k = h_1/h_2 > 1$ and $h_1$ and $h_2$ 
are the gap widths at the front and back of the 
pad. Find the value of k that makes F a maximum 
by using the bracket/quadratic approximation 
technique. 


12 A cylindrical can of radius R (cm) and height 
H (cm) is to be made with volume 1000 cm³. 
The cost of making the can is proportional to 

(amount of metal) × (machine factor) 

where the amount of metal is proportional to the 
surface area of the can (including the two ends) 
and the machine factor is given by $1 + [1 - (H/4R)]^2$ 
and reflects the difficulty of machining the can. 
Show that the cost is proportional to 

\[ 2\left(\frac{1000}{R} + \pi R^2\right)\left[1 + \left(1 - \frac{1000}{4\pi R^3}\right)^2\right] \]

Find a bracket for R and use the quadratic algorithm 
to estimate the radius that minimizes the cost. 


13 Use the quadratic approximation method to 
obtain a first estimate of the minimum of the 
function 

$f(x) = 1 - t + t^2$ 

where t is the non-negative root of 

$t^2 + tx - (1 - x^2) = 0$ 

Start with the interval $0 \le x \le 1$ and note that 
for x = 0.5, t = 0.6514 and f = 0.7729. 


14 In Figure 10.21 the disc rotates at a constant 
angular velocity, so $\theta = \omega t$. The subsequent 
movement of the slider P gives x = x(t). If 
$L/a = \lambda$ show that the velocity, v, of the 
slider is given by 

\[ \frac{v}{\omega a} = -\sin\theta\left[1 + \frac{\cos\theta}{(\lambda^2 - \sin^2\theta)^{1/2}}\right] \]

Use the bracket and quadratic approximation 
technique to evaluate the maximum and minimum 
velocities of the slider in the case $\lambda = 3$. 

Figure 10.21 Disc and slider in Review 
exercise 14. 


15 A trucking company estimates that the cost of 
running a truck is 


1 


0.02205 + у) 
10 


pounds per mile at a constant speed v. The driver 
earns £5 per hour. Find the cost for a journey 

of D miles. What speed is recommended to 
minimize the cost? 


16 Use (a) the steepest-descent method, (b) the 
Newton method and (c) the DFP method to 
find the position of the minimum of the 
function 


Ло, у) + (к-у + LG ey Df 


starting at (0, 0). Perform two cycles of each 
method, and compare your results. 




17 A compound pendulum consists of a rectangular 
lamina with a heavy particle embedded in it, as 
illustrated in Figure 10.22. For small oscillations 
about the equilibrium, $\theta = \alpha$, and putting $\theta = \alpha + \varepsilon$, 
the equation of motion is 


2с 


с =! 








Figure 10.22 Compound pendulum in Review 
exercise 17. 


where the period of the oscillations u is given, 
after a substantial calculation, by 


аш _ уКА+ У) + 2] 
S8.— X'«Y «AG K) 
with 
X-2xla, Y-2yla, k-bla, A-m/M 
Explore this expression for maximum and 


minimum values in the region |X| = k, Y « 2; 
take the case A=} andk=}. 
18 A method called Partan uses the notation $D^{(i)}$ for 
the gradient of f evaluated at $x = x^{(i)}$. The iteration 
scheme for evaluating the minimum of f using 


Partan is 
x% =x — yp DY 
a= mu 


| = 2085...) 


A ; | A 
x = Oe OD) 


where $\mu_i$ and $\lambda_i$ are chosen by optimum line searches. 
Sketch the progress of this method up to the point 
x for a scalar function of two variables f(x,, x;). 
Illustrate the use of Partan and the method 
of steepest descent on the quadratic function 


f= =] +(х = 1y 


starting at (0, 0). 


19 The Newton method described in Section 10.4.3 
often fails to converge. One way of overcoming 
this problem is to restrict the step length at 
each iteration. Given that x = a + h, this can be 
implemented by constraining h to have length L, 
where 

$h^{\mathrm{T}}h = L^2$ 

Use a Lagrange multiplier $\lambda$ to show that the 
result gives 


Xnew = Хоа g is ADG 


The algorithm is then implemented by 
successively using A = 0, 1, 10, 100, ... 
until a reduction in the function f is 
obtained. 

Starting at $x = [1\ \ 1]^{\mathrm{T}}$, perform one 
complete step of the modified algorithm on 
the function 


f=x +y -xy 
20 Use the method developed in Section 10.4.6 to 
iterate to the minimum of the functions 


(a) F=(x-y} +$ (x+y +1), starting at 
x=0,y=0; 


2 2 
(b) F= (=) + (TL , Starting at 
XED DL? XC 


xzl,yz-l. 


21 It is known from experience that a curve of 
the form 

\[ y = \frac{1}{a + bx} \]

should give a good fit to experimental data in the 
form of a set of points 

$(x_i, y_i) \quad (i = 1, 2, \ldots, p)$ 

It is required to calculate a and b by a best 
least-squares fit, and thus to minimize 

\[ \sum_{i=1}^{p}\left(y_i - \frac{1}{a + bx_i}\right)^2 \]

Use the least-squares algorithm described in 
Example 10.20 to fit the function to the data points 

(0, 1), (1, 0.6), (2, 0.3), (3, 0.2) 
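22 A quadratic function $f(x_1, x_2, \ldots, x_n)$ with a 
unique minimum is given in matrix form as 

$f = c + b^{\mathrm{T}}x + \tfrac{1}{2}x^{\mathrm{T}}Ax$ 

Show that a search in the direction 

$x = a + \lambda d$ 

produces the minimum at 

\[ \lambda_{\min} = -\frac{(b + Aa)^{\mathrm{T}}d}{d^{\mathrm{T}}Ad} \]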


23 Complete two complete cycles of the steepest- 
descent algorithm for the function 

$F(x, y) = (1 - x)^2 + (x - y)^2$ 

starting at x = 0, y = 0. Use Review exercise 22 
for the minimization in the search directions. 

Show that the minimum is obtained in a 
single iteration of the Newton method. 


24 (Harder) It is required to solve the differential 
equation 


ИЕ ЕВ 


with the boundary conditions y(0) = 1 and 
y(1) = 3, by a shooting method. The equation 
is solved for the initial conditions 

$y(0) = 1, \quad y'(0) = \alpha$ 

by any suitable method (for example, by a 
Runge-Kutta method). With this solution, 
calculate 

$F(\alpha) = y(1)$ 

and then try to drive F to the value 3 by 
minimization of 

$[F(\alpha) - 3]^2$ 


In this example illustrate the method by using 
the exact solution 


у= (1+ р)е – р 


for the forward integration. 


25 (An extended problem) In the chemical 
processing plant model in Section 10.5 
consider the profits when 


(a) the primary pump is faulty and the 
constraint x, « 12 is imposed; 

(b) the waste pump between primary and 
secondary fails so that xı, = 0 (see 
Figure 10.19). 


26 (An extended problem) Extend the heating 
fin analysis in Section 10.6, using a higher 
approximation to the temperature: 

$T - T_0 = (T_1 - T_0)(1 - \alpha x - \beta x^2)$ 


Compare the shape of the fin with that given 
in the text, and compute the heat transferred 
in each case. 
11.1 Introduction 


Applications of probability and statistics in engineering are very far-reaching. Data from 
experiments have to be analysed and conclusions drawn, decisions have to be made, 
production and distribution have to be organized and monitored, and quality has to be 
controlled. In all of these activities probability and statistics have at least a supporting 
role to play, and sometimes a central one. 

The distinction between applied probability and statistics is blurred, but essentially it 
is this: applied probability is about mathematical modelling of a situation that involves 
random uncertainty, whereas statistics is the business of handling data and drawing 
conclusions, and can be regarded as a branch of applied probability. Most of this chapter 
is about statistics, but Section 11.10 on queueing theory is applied probability. 

When applying statistical methods to a practical problem, the most visible activity is 
the processing of data, using either a hand calculator or, increasingly often, a computer 
statistical package. Either way, a formula or standard procedure from a textbook is 
being applied to the data. The relative ease and obviousness of this activity sometimes 
leads to a false sense that there is nothing more to it. On the contrary, the handling of the 
data (by whatever means) is quite superficial compared with the essential task of trying 
to understand both the problem at hand and the assumptions upon which the various 
statistical procedures are based. If the wrong procedure is chosen, a wrong conclusion 
may be drawn. 

It is, unfortunately, all too easy to use a formula while overlooking its theoretical 
basis, which largely determines its applicability. It is true that there are some statistical 
methods that continue to work reliably even where the assumptions upon which they 
are based do not hold (such methods are called robust), but it is unwise to rely too 
heavily upon this and even worse to be unaware of the assumptions at all. 

The conclusions of a statistical analysis are often expressed in a qualified way such 
as ‘We can be 95% sure that . . .’. At first this seems vague and inadequate. Perhaps a 
decision has to be made, but the statistical conclusion is not expressed simply as ‘yes’ 
or ‘no’. A statistical analysis is rather like a legal case in which the witness is required 
to tell ‘the whole truth and nothing but the truth’. In the present context ‘the whole 
truth’ means that the statistician must glean as much information from the data as is 
possible until nothing but pure randomness remains. ‘Nothing but the truth’ means that 
the statistician must not state the conclusion with any greater degree of certainty or 
confidence than is justified by the analysis. In fact there is a practical compromise 
between truth and precision that will be explained in Section 11.3.3. The result of all 
this is that the decision-maker is aided by the analysis but not pre-empted by it. 

In this chapter we shall first review the basic theory of probability and then cover 
some applications that are beneficial in engineering and many other fields: the statistics 
of means, proportions and correlation, linear regression and goodness-of-fit testing, 
queueing theory and quality control. 


11.2 Review of basic probability theory 


This section contains an overview of the basic theory used in the remainder of this 
chapter. No attempt is made to explain or justify the ideas or results: a full account 
can be found in Chapter 13 of Modern Engineering Mathematics or elsewhere. For the 
same reason there are no examples or exercises. In the process of reviewing the basic 
theory, this section also establishes the pattern of notation used throughout the chapter, 
which follows standard conventions as far as possible. No reader should embark on this 
chapter without having a fairly thorough understanding of the material in this section. 


11.2.1 The rules of probability 


We associate a probability P(A) with an event A, which in general is a subset of a sample 
space S. The usual set-theoretic operations apply to the events (subsets) in S, and there 
are corresponding rules that must be satisfied by the probabilities. 


Complement rule 

\[ P(S - A) = 1 - P(A) \]

The complement of an event A is often written as $\bar{A}$. 

Addition rule 

\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]

For disjoint events, $A \cap B = \varnothing$, and the addition rule takes the simple form 

\[ P(A \cup B) = P(A) + P(B) \]

Product rule 

\[ P(A \cap B) = P(A)P(B \mid A) \]

This is actually the definition of the conditional probability P(B | A) of B given A. If A 
and B are independent then the product rule takes the simple form 

\[ P(A \cap B) = P(A)P(B) \]
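The three rules can be illustrated on a small finite sample space. The sketch below assumes a single roll of a fair die with equally likely outcomes; the events A, B and C are hypothetical choices made only for illustration, and exact fractions are used so that the identities hold without rounding.

```python
from fractions import Fraction

# Sample space for one roll of a fair die (assumed equally likely outcomes).
S = frozenset(range(1, 7))


def P(event):
    """Probability of an event (a subset of S) under equal likelihood."""
    return Fraction(len(event & S), len(S))


A = frozenset({2, 4, 6})   # even score
B = frozenset({4, 5, 6})   # score of at least 4
C = frozenset({1, 2})      # score of at most 2

# Complement rule: P(S - A) = 1 - P(A)
complement_ok = P(S - A) == 1 - P(A)

# Addition rule: P(A u B) = P(A) + P(B) - P(A n B)
addition_ok = P(A | B) == P(A) + P(B) - P(A & B)

# Product rule, read as the definition of conditional probability.
P_B_given_A = P(A & B) / P(A)

# Independence holds when P(A n C) = P(A)P(C).
independent_AC = P(A & C) == P(A) * P(C)
independent_AB = P(A & B) == P(A) * P(B)

print(complement_ok, addition_ok, P_B_given_A, independent_AC, independent_AB)
```

Here A and C turn out to be independent while A and B are not, which shows that independence is a property of the probabilities rather than of the way the events are described.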


11.2.2 Random variables 


A random variable has a sample space of possible numerical values together with a 
distribution of probabilities. Random variables can be either discrete or continuous. 
For a discrete random variable (X, say) the possible values can be written as a list 
$\{v_1, v_2, v_3, \ldots\}$ with corresponding probabilities $P(X = v_1), P(X = v_2), P(X = v_3), \ldots$. 
The mean of X is then defined as 

\[ \mu_X = E(X) = \sum_k v_k P(X = v_k) \]

(sum over all possible values), and is a measure of the central location of the distribution. 
The variance of X is defined as 

\[ \mathrm{Var}(X) = \sigma_X^2 = \sum_k (v_k - \mu_X)^2 P(X = v_k) \]

and is a measure of dispersion of the distribution about the mean. The symbols $\mu$ and 
$\sigma^2$ are conventional for these quantities. In general, the expected value of a function 
h(X) of X is defined as 

\[ E\{h(X)\} = \sum_k h(v_k) P(X = v_k) \]

of which the mean and variance are special cases. The standard deviation $\sigma_X$ is the 
square root of the variance. 

Figure 11.1 Probability of interval from density function. 
For a continuous random variable (X, say), there is a probability density function 
$f_X(x)$ and a cumulative distribution function $F_X(x)$. The cumulative distribution function 
is defined as 

\[ F_X(x) = P(X \le x) \]

and is the indefinite integral of the density function: 

\[ F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt \]

These functions determine the probabilities of events for the random variable. The 
probability that the variable X takes a value within the real interval $(x_1, x_2)$ is the area 
under the density function over that interval, or equivalently the difference in values of 
the distribution function at its ends: 

\[ P(x_1 < X < x_2) = \int_{x_1}^{x_2} f_X(t)\,dt = F_X(x_2) - F_X(x_1) \]

(see Figure 11.1). Note that the events $x_1 < X < x_2$, $x_1 \le X \le x_2$, $x_1 < X \le x_2$ and 
$x_1 \le X < x_2$ are all equivalent in probability terms for a continuous random variable 
X because the probability of X being exactly equal to either $x_1$ or $x_2$ is zero. The mean 
and variance of X, and the expected value of a function h(X), are defined in terms of 
the density function by 

\[ \mu_X = E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx \]

\[ \mathrm{Var}(X) = \sigma_X^2 = \int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx \]

\[ E\{h(X)\} = \int_{-\infty}^{\infty} h(x) f_X(x)\,dx \]

These definitions assume that the random variable is defined for values of x from $-\infty$ to 
$\infty$. If the random variable is defined in general for values in some real interval, say 
(a, b), then the domain of integration can be restricted to that interval, or alternatively 
the density function can be defined to be zero outside that interval. 
Just as events can be independent (and then obey a simple product rule of probabilities), 
so can random variables be independent. We shall consider this in more detail in 
Section 11.4. Means and variances of random variables (whether discrete or continuous) 
have the following important properties (X and Y are random variables, and c is an 
arbitrary constant): 

\[ E(cX) = cE(X) = c\mu_X \]
\[ \mathrm{Var}(cX) = c^2\,\mathrm{Var}(X) = c^2\sigma_X^2 \]
\[ E(X + c) = E(X) + c = \mu_X + c \]
\[ \mathrm{Var}(X + c) = \mathrm{Var}(X) = \sigma_X^2 \]
\[ E(X + Y) = E(X) + E(Y) = \mu_X + \mu_Y \]

(this applies whether or not X and Y are independent), 

\[ \mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) = \sigma_X^2 + \sigma_Y^2 \]

(this applies only when X and Y are independent). 
It is also useful to note that $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$. 
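These properties can be verified directly for small discrete distributions. The sketch below is only an illustration: the distributions X and Y and the constant c are hypothetical, a distribution is represented as a dictionary mapping each value to its probability, and the helper names are not from the text.

```python
from fractions import Fraction
from itertools import product


def mean(dist):
    """E(X) = sum of v * P(X = v)."""
    return sum(v * p for v, p in dist.items())


def variance(dist):
    """Var(X) = sum of (v - mu)^2 * P(X = v)."""
    m = mean(dist)
    return sum((v - m) ** 2 * p for v, p in dist.items())


def transform(dist, func):
    """Distribution of func(X), merging values that coincide."""
    out = {}
    for v, p in dist.items():
        out[func(v)] = out.get(func(v), 0) + p
    return out


def sum_independent(dist_x, dist_y):
    """Distribution of X + Y when X and Y are independent."""
    out = {}
    for (v, p), (w, q) in product(dist_x.items(), dist_y.items()):
        out[v + w] = out.get(v + w, 0) + p * q
    return out


X = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
Y = {0: Fraction(1, 3), 5: Fraction(2, 3)}
c = 3

scaled = transform(X, lambda v: c * v)     # distribution of cX
shifted = transform(X, lambda v: v + c)    # distribution of X + c
total = sum_independent(X, Y)              # distribution of X + Y
squared = transform(X, lambda v: v * v)    # distribution of X^2

print(mean(X), variance(X))
print(mean(scaled), variance(scaled))
print(mean(total), variance(total))
```

Because the joint distribution of X and Y is built as a product, the check of Var(X + Y) relies on independence, exactly as the text requires.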


11.2.3 The Bernoulli, binomial and Poisson distributions 


The simplest example of a discrete distribution is the Bernoulli distribution. This has 
just two values: X = 1 with probability p and X = 0 with probability 1 − p, from which 
the mean and variance are p and p(1 − p) respectively. 

The binomial and Poisson distributions are families of discrete distributions whose 
probabilities are generated by formulae, and which arise in many real situations. The 
binomial distribution governs the number (X, say) of ‘successes’ in n independent 
‘trials’, with a probability p of ‘success’ at each trial: 


\[ P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k} \]

where the range of possible values (k) is {0, 1, 2, . . . , n}. The binomial distribution can 
be thought of as the sum of n independent Bernoulli random variables. This distribution 
(more properly, family of distributions) has two parameters, n and p. In terms of these 
parameters, the mean and variance are 

\[ \mu_X = np \]
\[ \sigma_X^2 = np(1 - p) \]

The Poisson distribution is defined as 

\[ P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!} \]

where the range of possible values (k) is the set of non-negative integers {0, 1, 2, . . . }. 
This has mean and variance both equal to the single parameter $\lambda$ (see Section 11.7.1), 
and, by setting $\lambda = np$, provides a useful approximation to the binomial distribution that 
works when n is large and p is small (see Section 11.7.2). As a guide, the approximation 
can be used when $n \ge 25$ and $p \le 0.1$. The Poisson distribution has many other uses, 
as will be seen in Section 11.10. 
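The quality of the Poisson approximation can be seen by tabulating both sets of probabilities. The sketch below assumes the illustrative values n = 50 and p = 0.05 (so that λ = np = 2.5), which lie within the guideline quoted above; the function names are chosen here for illustration only.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with parameters n and p."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with parameter lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


n, p = 50, 0.05        # illustrative values: n large, p small
lam = n * p            # matching Poisson parameter

rows = [(k, binomial_pmf(k, n, p), poisson_pmf(k, lam)) for k in range(9)]
max_diff = max(abs(b - q) for _, b, q in rows)

for k, b, q in rows:
    print(f"k={k}  binomial={b:.4f}  poisson={q:.4f}")
print("largest absolute difference:", round(max_diff, 4))
```

Repeating the comparison with a larger p (for example 0.4) gives a noticeably worse agreement, which is the reason for the restriction on p.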
11.2.4 The normal distribution 


This is a family of continuous distributions with probability density function given by 

\[ f_X(x) = \frac{1}{\sigma_X\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu_X}{\sigma_X}\right)^2\right] \]

for $-\infty < x < \infty$, where the parameters $\mu_X$ and $\sigma_X$ are the mean and standard deviation 
of the distribution. It is conventional to denote the fact that a random variable X has a 
normal distribution by 

\[ X \sim N(\mu_X, \sigma_X^2) \]

The standard normal distribution is a special case with zero mean and unit variance, 
often denoted by Z: 

\[ Z \sim N(0, 1) \]

Tables of the standard normal cumulative distribution function 

\[ \Phi(z) = P(Z \le z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt \]

are widely available (see for example Figure 11.2). These tables can be used for 
probability calculations involving arbitrary normal random variables. For example, if 
$X \sim N(\mu_X, \sigma_X^2)$ then 

\[ P(X \le a) = P\left(\frac{X - \mu_X}{\sigma_X} \le \frac{a - \mu_X}{\sigma_X}\right) = \Phi\left(\frac{a - \mu_X}{\sigma_X}\right) \]

The key result for applications of the normal distribution is the central limit theorem: 
If $\{X_1, X_2, X_3, \ldots, X_n\}$ are independent and identically distributed random variables (the 
distribution being arbitrary), each with mean $\mu_X$ and variance $\sigma_X^2$, and if 

\[ W_n = \frac{X_1 + \cdots + X_n}{n}, \qquad Z_n = \frac{X_1 + \cdots + X_n - n\mu_X}{\sigma_X\sqrt{n}} \]

then, as $n \to \infty$, the distributions of $W_n$ and $Z_n$ tend to $W_n \sim N(\mu_X, \sigma_X^2/n)$ and $Z_n \sim N(0, 1)$ 
respectively. Loosely speaking, the sum of independent identically distributed random 
variables tends to a normal distribution. 

This theorem is proved in Section 11.7.3, and is the key to many statistical processes, 
some of which are described in Section 11.3. One corollary is that the normal distribution 
can be used to approximate the binomial distribution when n is sufficiently large: if X 
is binomial with parameters n and p then the approximating distribution (by equating 
the means and variances) is $Y \sim N(np, np(1 - p))$. This is explained (together with the 
important continuity correction) in Section 13.5.5 of Modern Engineering Mathematics, 
and the approximation is used as follows: 

\[ P(X \le k) \approx \Phi\left(\frac{k + 0.5 - np}{\sqrt{[np(1 - p)]}}\right) \]

\[ P(X = k) \approx \Phi\left(\frac{k + 0.5 - np}{\sqrt{[np(1 - p)]}}\right) - \Phi\left(\frac{k - 0.5 - np}{\sqrt{[np(1 - p)]}}\right) \]

As a guide, the approximation can be used when n > 25 and 0.1 < p < 0.9. 
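The continuity-corrected formulae can be compared with exact binomial probabilities. The sketch below assumes the illustrative values n = 40, p = 0.3 and k = 12, and evaluates Φ(z) through the error function in Python's standard library rather than from a table.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binomial_pmf(k, n, p):
    """Exact binomial probability P(X = k)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """Exact binomial probability P(X <= k)."""
    return sum(binomial_pmf(j, n, p) for j in range(k + 1))


def normal_approx_cdf(k, n, p):
    """P(X <= k) from the normal approximation with continuity correction."""
    sd = math.sqrt(n * p * (1.0 - p))
    return phi((k + 0.5 - n * p) / sd)


def normal_approx_pmf(k, n, p):
    """P(X = k) as the normal area between k - 0.5 and k + 0.5."""
    sd = math.sqrt(n * p * (1.0 - p))
    return phi((k + 0.5 - n * p) / sd) - phi((k - 0.5 - n * p) / sd)


n, p, k = 40, 0.3, 12   # illustrative values within the stated guideline

exact_cdf = binomial_cdf(k, n, p)
approx_cdf = normal_approx_cdf(k, n, p)
exact_pmf = binomial_pmf(k, n, p)
approx_pmf = normal_approx_pmf(k, n, p)

print(f"P(X <= {k}): exact {exact_cdf:.4f}, approximate {approx_cdf:.4f}")
print(f"P(X = {k}):  exact {exact_pmf:.4f}, approximate {approx_pmf:.4f}")
```

Dropping the 0.5 terms in `normal_approx_cdf` makes the agreement visibly worse for moderate n, which is why the continuity correction is described as important.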
Figure 11.2 Table of the standard normal cumulative distribution function Φ(z). 

  z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09 

 0.0  .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359 
 0.1  .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753 
 0.2  .5793  .5832  .5871  .5910  .5948  .5987  .6026  .6064  .6103  .6141 
 0.3  .6179  .6217  .6255  .6293  .6331  .6368  .6406  .6443  .6480  .6517 
 0.4  .6554  .6591  .6628  .6664  .6700  .6736  .6772  .6808  .6844  .6879 

 0.5  .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224 
 0.6  .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549 
 0.7  .7580  .7611  .7642  .7673  .7704  .7734  .7764  .7794  .7823  .7852 
 0.8  .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133 
 0.9  .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389 

 1.0  .8413  .8438  .8461  .8485  .8508  .8531  .8554  .8577  .8599  .8621 
 1.1  .8643  .8665  .8686  .8708  .8729  .8749  .8770  .8790  .8810  .8830 
 1.2  .8849  .8869  .8888  .8907  .8925  .8944  .8962  .8980  .8997  .9015 
 1.3  .9032  .9049  .9066  .9082  .9099  .9115  .9131  .9147  .9162  .9177 
 1.4  .9192  .9207  .9222  .9236  .9251  .9265  .9279  .9292  .9306  .9319 

 1.5  .9332  .9345  .9357  .9370  .9382  .9394  .9406  .9418  .9429  .9441 
 1.6  .9452  .9463  .9474  .9484  .9495  .9505  .9515  .9525  .9535  .9545 
 1.7  .9554  .9564  .9573  .9582  .9591  .9599  .9608  .9616  .9625  .9633 
 1.8  .9641  .9649  .9656  .9664  .9671  .9678  .9686  .9693  .9699  .9706 
 1.9  .9713  .9719  .9726  .9732  .9738  .9744  .9750  .9756  .9761  .9767 

 2.0  .9772  .9778  .9783  .9788  .9793  .9798  .9803  .9808  .9812  .9817 
 2.1  .9821  .9826  .9830  .9834  .9838  .9842  .9846  .9850  .9854  .9857 
 2.2  .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890 
 2.3  .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916 
 2.4  .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936 

 2.5  .9938  .9940  .9941  .9943  .9945  .9946  .9948  .9949  .9951  .9952 
 2.6  .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964 
 2.7  .9965  .9966  .9967  .9968  .9969  .9970  .9971  .9972  .9973  .9974 
 2.8  .9974  .9975  .9976  .9977  .9977  .9978  .9979  .9979  .9980  .9981 
 2.9  .9981  .9982  .9982  .9983  .9984  .9984  .9985  .9985  .9986  .9986 

 3.0  .9987  .9987  .9987  .9988  .9988  .9989  .9989  .9989  .9990  .9990 
 3.1  .9990  .9991  .9991  .9991  .9992  .9992  .9992  .9992  .9993  .9993 
 3.2  .9993  .9993  .9994  .9994  .9994  .9994  .9994  .9995  .9995  .9995 
 3.3  .9995  .9995  .9995  .9996  .9996  .9996  .9996  .9996  .9996  .9997 
 3.4  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9998 

 z             1.282  1.645  1.960  2.326  2.576  3.090  3.291  3.891   4.417 
 Φ(z)           .90    .95    .975   .99    .995   .999   .9995  .99995  .999995 
 2[1 − Φ(z)]    .20    .10    .05    .02    .01    .002   .001   .0001   .00001 


11.2.5 Sample measures 


It is conventional to denote a random variable by an upper-case letter (X, say), and an 
actual observed value of it by the corresponding lower-case letter (x, say). An observed 
value x will be one of the set of possible values (sample space) for the random variable, 
which for a discrete random variable may be written as a list of the form $\{v_1, v_2, v_3, \ldots\}$. 

It is possible to observe a random variable many times (say n times) and obtain a series 
of values. In this case we assume that the random variable X refers to a population 
(whose characteristics may be unknown), and regard the series of random variables 
$\{X_1, X_2, \ldots, X_n\}$ as a sample. Each $X_i$ is assumed to have the characteristics of the population, 
so they all have the same distribution. The actual series of values $\{x_1, x_2, \ldots, x_n\}$ 
consists of data upon which we can work, but it is useful to define certain sample 
measures in terms of the random variables $\{X_1, X_2, \ldots, X_n\}$ in order to interpret the 
data. Principal among these measures are the sample average and sample variance, 
defined as 

\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad S_X^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2 \]

respectively, and it is useful to note that the sample variance is the average of the 
squares minus the square of the average: 

\[ S_X^2 = \overline{X^2} - (\bar{X})^2 \]

We shall also need the following alternative definition of sample variance in Section 11.3.5: 

\[ S_{X,n-1}^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 \]

We can use the properties of means and variances (summarized in Section 11.2.2) to 
find the mean and variance of the sample average as follows: 

\[ E(\bar{X}) = \frac{1}{n}E(X_1 + \cdots + X_n) = \frac{1}{n}[E(X_1) + \cdots + E(X_n)] = \frac{n\mu_X}{n} = \mu_X \]

\[ \mathrm{Var}(\bar{X}) = \frac{1}{n^2}\mathrm{Var}(X_1 + \cdots + X_n) = \frac{1}{n^2}[\mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n)] = \frac{n\sigma_X^2}{n^2} = \frac{\sigma_X^2}{n} \]

Here we are assuming that the population mean and variance are $\mu_X$ and $\sigma_X^2$ respectively 
(which may be unknown values in practice), and that the observations of the random 
variables $X_i$ are independent, a very important requirement in statistics. 
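The two definitions of sample variance differ only in the divisor. The sketch below computes both for a small hypothetical data set (the numbers are illustrative, not from the text) and checks the identity that the sample variance with divisor n equals the average of the squares minus the square of the average.

```python
def sample_average(xs):
    """Sample average: (1/n) * sum of x_i."""
    return sum(xs) / len(xs)


def sample_variance(xs, ddof=0):
    """Sample variance with divisor n - ddof.

    ddof=0 gives the divisor-n definition; ddof=1 gives the
    alternative divisor-(n - 1) definition.
    """
    xbar = sample_average(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - ddof)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical observations

xbar = sample_average(data)
s2_n = sample_variance(data)             # divisor n
s2_n1 = sample_variance(data, ddof=1)    # divisor n - 1

# Average of the squares minus the square of the average.
mean_of_squares = sample_average([x * x for x in data])
shortcut = mean_of_squares - xbar ** 2

print(xbar, s2_n, s2_n1, shortcut)
```

For these data the divisor-(n − 1) value is larger by the factor n/(n − 1) = 8/7, a difference that becomes negligible as n grows.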


11.3 Estimating parameters 


11.3.1 Interval estimates and hypothesis tests 


The first step in statistics is to take some data from an experiment and make inferences 
about the values of certain parameters. Such parameters could be the mean and variance 
of a population, or the correlation between two variables for a population. The data are 
never sufficient to determine the values exactly, but two kinds of inferences can be made: 


(a) a range of values can be quoted, within which it is believed with high probability 
that the population parameter value lies, or 

(b) a decision can be made as to whether or not the data are compatible with a 
particular value of the parameter. 


The first of these is called interval estimation, and provides an assessment of the value 
that is rather more honest than merely quoting a single number derived from the sample 
data, which may be more or less uncertain depending upon the sample size. The second 
approach is called hypothesis testing and allows a value of particular interest to be 
assessed. These two approaches are usually covered in separate chapters in introductory 
textbooks on statistics, but they are closely related and are often used in conjunction 
with each other. Tests of simple hypotheses about parameter values will therefore be 
covered here within the context of interval estimation. 


11.3.2 Distribution of the sample average 


Suppose that a clearly identified population has a numerical characteristic with an 
unknown mean value, such as the mean lifetime for a kind of electronic component or 
the mean salary for a job category. A natural way to estimate this unknown mean is to 
take a sample from the population, measure the appropriate characteristic, and find the 
average value. If the sample size is n and the measured values are $\{x_1, x_2, \ldots, x_n\}$ then 
the average value 

\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \]

is a reasonable estimate of the population mean $\mu_X$ provided that the sample is 
representative and independent, and the size n is sufficiently large. 
We can be more precise about how useful this estimate is if we treat the sample 
average as a random variable. Now we have a sample $\{X_1, X_2, \ldots, X_n\}$ with average 

\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \]

and the mean and variance of $\bar{X}$ are given by 

\[ E(\bar{X}) = \mu_X, \qquad \mathrm{Var}(\bar{X}) = \frac{\sigma_X^2}{n} \]

(see Section 11.2.5). This shows that the expected value of the average is indeed equal 
to the population mean, and that the variance is smaller for larger samples. However, 
we can go further. The central limit theorem (Section 11.2.4) tells us that sums of 
identical random variables tend to have a normal distribution regardless of the distribution 
of the variables themselves. The only requirement is that a sufficient number of variables 
contribute to the sum (the actual number required depends very much on the shape of 
the underlying distribution). 


The sample average is a sum of random variables, and therefore has (approximately) 
a normal distribution for a sufficiently large sample: 


\[ \bar{X} \sim N(\mu_X, \sigma_X^2/n) \]
This allows us to use a general method of inference concerning means instead of a 
separate method for each underlying distribution, even if this were known, which is 
usually not the case. In practice, a sample size of 25 or more is usually sufficient for the 
normal approximation. 


Example 11.1 

For all children taking an examination, the mean mark was 60%, with a standard 
deviation of 8%. A particular class of 30 children achieved an average of 63%. Is this 
unusual? 


Solution 

The average of 63% is higher than the mean, but not by very much. We do not know the true 
distribution of marks, but the sample average has (approximately) a normal distribution. 
We can test the idea that this particular class result is a fluke by reducing the sample 
average to a standard normal in the manner described in Section 11.2.4 and checking 
its value against the table of the cumulative distribution function Φ(z) (Figure 11.2): 

\[ P(\bar{X} \ge 63) = P\left(\frac{\bar{X} - 60}{8/\sqrt{30}} \ge \frac{63 - 60}{8/\sqrt{30}}\right) = 1 - \Phi(2.054) = 0.020 \]

It is unlikely (one chance in 50) that an average as high as this could occur by chance, 
assuming that the ability of the class is typical. Figure 11.3 illustrates that the result 
is towards the tail of the distribution. It therefore seems that this class is unusually 
successful. 

Figure 11.3 Normal density function for Example 11.1. 





Confidence interval for the mean 


A useful notation will be introduced here. For the standard normal distribution, define
$z_\alpha$ to be the point on the $z$ axis for which the area under the density function to its
right is equal to $\alpha$:

$$P(Z > z_\alpha) = \alpha$$

or equivalently

$$\Phi(z_\alpha) = 1 - \alpha$$

(see Figure 11.4a). From the standard normal table we have $z_{0.05} = 1.645$ and
$z_{0.025} = 1.96$. By symmetry

$$P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$$

(see Figure 11.4b). Assuming normality of the sample average, we have

$$P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha$$

which, after multiplying through the inequality by $\sigma_X/\sqrt{n}$ and changing the sign, gives

$$P\left(-z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} < \mu_X - \bar{X} < z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}\right) = 1 - \alpha$$

so that

$$P\left(\bar{X} - z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} < \mu_X < \bar{X} + z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}\right) = 1 - \alpha$$

Figure 11.4 Normal density functions with (a) $z_\alpha$ and (b) $z_{\alpha/2}$.

Assume for now that the standard deviation of $X$ is known (it is actually very rare for
$\sigma_X$ to be known when $\mu_X$ is unknown, but we shall discuss this case first for simplicity
and later consider the more general situation where both $\mu_X$ and $\sigma_X$ are unknown).

The interval defined by $(\bar{X} \pm z_{\alpha/2}\sigma_X/\sqrt{n})$ is called a $100(1 - \alpha)\%$ confidence interval
for the mean, with variance known. If a value for $\alpha$ is specified, the upper and lower
limits of this interval can be calculated from the sample average. The probability is
$1 - \alpha$ that the true mean lies between them.


Example 11.2

The temperature (in degrees Celsius) at ten points chosen at random in a large building
is measured, giving the following list of readings:

{18°, 16.5°, 17.5°, 18°, 19.5°, 16.5°, 18°, 17°, 19°, 17.5°}

The standard deviation of temperature through the building is known from past
experience to be 1 °C. Find a 90% confidence interval for the mean temperature in the
building.


Solution


The average of the ten readings is 17.75 °C, and, using $z_{0.05} = 1.645$, the 90% confidence
interval is

$$(17.75 \pm 1.645(1/\sqrt{10})) = (17.2, 18.3)$$

The confidence interval is used to indicate the degree of uncertainty in the sample
average. The simplicity of the calculation is deceptive because the idea is very important
and easily misunderstood. It is not the mean that is random but rather the interval that
would enclose it $100(1 - \alpha)\%$ of the times the experiment is performed. It is tempting
to think of the interval as fixed by the experiment and the mean as a random variable that
has a probability $1 - \alpha$ of lying within it, but this is not correct.

Typical values of $\alpha$ are 0.1, 0.05 and 0.01, giving 90%, 95% and 99% confidence
intervals respectively. The value chosen is a compromise between truth and precision,
as illustrated in Figure 11.5. A statement saying that the mean lies within the interval
$(-\infty, \infty)$ is 100% true (certain to be the case), but totally uninformative because of its
total imprecision. None of the possible values is ruled out. On the other hand, saying
that the mean equals the exact value given by the sample average is maximally precise,
but again of limited value because the statement is false (or rather the probability of
its truth is zero). A statement quoting a finite interval for the mean has a probability of
being true, chosen to be quite high, and at the same time it rules out most of the possible
values and therefore is highly informative. The higher the probability of truth, the lower
the informativeness, and vice versa.

The width of the interval also depends on the sample size. A larger experiment
yields a more precise result. If figures for the confidence $1 - \alpha$ and precision (width of
the interval) are specified in advance then the sample size can be chosen sufficiently
large to satisfy these requirements. In some experimental situations (for example,
destructive testing) there are incentives to keep sample sizes as small as possible. The
experimenter must weigh up these conflicting objectives and design the experiment
accordingly.

Figure 11.5 Confidence intervals: (a) infinite interval; (b) finite interval; (c) point value.


Example 11.3

A machine fills cartons of liquid; the mean fill is adjustable but the dial on the gauge
is not very accurate. The standard deviation of the quantity of fill is 6 ml. A sample of
30 cartons gave a measured average content of 570 ml. Find 90% and 95% confidence
intervals for the mean.


Solution

Using $\alpha = 0.05$ and $z_{0.025} = 1.96$, the 95% confidence interval is

$$(570 \pm 1.96(6/\sqrt{30})) = (567.8, 572.1)$$

Likewise, using $\alpha = 0.1$ and $z_{0.05} = 1.645$, the 90% confidence interval is

$$(570 \pm 1.645(6/\sqrt{30})) = (568.2, 571.8)$$

As expected, the 95% interval is slightly wider.


11.3.4 Testing simple hypotheses


As explained in Section 11.3.1, the testing of hypotheses about parameter values is 

complementary to the estimation process involving an interval. A ‘simple’ hypothesis 

is one that specifies a particular value for the parameter, as opposed to an interval, and 

it is this type that we shall consider. The following remarks apply generally to parameter 

hypothesis testing, but will be directed in particular to hypotheses concerning means. 
There are two kinds of errors that can occur when testing hypotheses: 


(a) a true hypothesis can be rejected (this is usually referred to as a type I error), or 
(b) a false hypothesis can be accepted (this is usually called a type II error). 


In reality, all hypotheses that prescribe particular values for parameters are false, but they 
may be approximately true and rejection may be the result of an experimental fluke. 
This is the sense in which a type I error can occur. Any hypothesis will be rejected if 
the sample size is large enough. Acceptance really means that there is insufficient 
evidence to reject the hypothesis, but this is not an entirely negative view because if the 
hypothesis has survived the test then it has some degree of dependability. 


Normally a simple hypothesis is tested by evaluating a test statistic, a quantity
that depends upon the sample and leads to rejection of the hypothesized parameter
value if its magnitude exceeds a certain threshold. If the hypothesized mean is $\mu_0$
then the test statistic for the mean is

$$Z = \frac{\bar{X} - \mu_0}{\sigma_X/\sqrt{n}}$$

with the hypothesis ‘rejected at significance level $\alpha$’ if $|Z| > z_{\alpha/2}$.

The significance level can be regarded as the probability of false rejection, an error
of type I. If the hypothesis is true then $Z$ has a standard normal distribution and the
probability that it will exceed $z_{\alpha/2}$ in magnitude is $\alpha$. If $Z$ does exceed this value then
either the hypothesis is wrong or else a rare event has occurred. It is easy to show
that the test statistic lies on this threshold (for significance level $\alpha$) exactly when
the hypothesized value lies at one or other extreme of the $100(1 - \alpha)\%$ confidence
interval (see Figure 11.6). An alternative way to test the hypothesis is therefore to see
whether or not the value lies within the confidence interval.

Figure 11.6 Confidence interval and hypothesis test.
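
The equivalence between the test statistic and the confidence interval can be illustrated
with a short Python sketch. The function names below are hypothetical, and the figures used
(a sample of 30 with average 570 ml, standard deviation 6 ml and hypothesized mean 568 ml) are
those of Examples 11.3 and 11.4. The normal quantile is taken from the standard library
instead of the printed table, so the last decimal place may differ slightly from the text.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, alpha):
    """100(1 - alpha)% confidence interval for the mean, variance known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

def z_statistic(sample_mean, mu0, sigma, n):
    """Test statistic Z for the hypothesized mean mu0."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

def rejects(sample_mean, mu0, sigma, n, alpha):
    """True if the hypothesis is rejected at significance level alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z_statistic(sample_mean, mu0, sigma, n)) > z

interval_95 = z_interval(570, 6, 30, 0.05)
interval_90 = z_interval(570, 6, 30, 0.10)
z_value = z_statistic(570, 568, 6, 30)

for alpha, (low, high) in ((0.05, interval_95), (0.10, interval_90)):
    outside = not (low <= 568 <= high)
    # The two ways of testing should always agree.
    print(alpha, (round(low, 1), round(high, 1)), outside,
          rejects(570, 568, 6, 30, alpha))
```

In this sketch the hypothesized value falls outside the interval exactly when `rejects`
returns true, which is the relationship shown in Figure 11.6.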


Example 11.4

For the situation described in Example 11.3 test the hypothesis that the mean fill of liquid
is 568 ml (one imperial pint).


Solution

The value of the test statistic is

$$Z = \frac{570 - 568}{6/\sqrt{30}} = 1.83$$

This exceeds $z_{0.05} = 1.645$ (10% significance), but is less than $z_{0.025} = 1.96$ (5% significance).
Alternatively, the quoted figure lies within the 95% confidence interval but
outside the 90% confidence interval. Either way, the hypothesis is rejected at the 10%
significance level but accepted at the 5% level. If the actual mean is 568 ml then there
is less than one chance in 10 (but more than one in 20) that a result as extreme as 570 ml
will be obtained. It looks as though the true mean is larger than the intended value, but
the evidence is not particularly strong. The probability of false rejection (type I error)
is somewhere between 5% and 10%, which is small but not negligible.


Examples 11.3 and 11.4 set the pattern for the interpretation and use of confidence
intervals. We shall now see how to apply these ideas more generally.


11.3.5 Other confidence intervals and tests concerning means


Mean when variance is unknown 


With the basic ideas of interval estimation and hypothesis testing established, it is
relatively easy to cover other cases. The first and most obvious is to remove the assumption
that the variance is known. If the sample size is large then there is essentially no
problem, because the sample standard deviation $S_{X,n-1}$ can be used in place of $\sigma_X$ in the
confidence interval, where

$$S_{X,n-1}^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$

This definition was introduced in Section 11.2.5. Note that the sum is divided by $n - 1$
rather than $n$. For a large sample this makes little difference, but for a small sample this
form must be used because the ‘t distribution’ requires it.

Suppose that the sample size is small, say less than 25. Using $S_{X,n-1}$ in place of $\sigma_X$
adds an extra uncertainty because this estimate is itself subject to error. Furthermore,
the central limit theorem cannot be relied upon to ensure that the sample average has a
normal distribution. We have to assume that the data themselves are normal. In this
situation the random variable

$$T_n = \frac{\bar{X} - \mu_X}{S_{X,n-1}/\sqrt{n}}$$

has a t distribution with parameter $n - 1$. This distribution resembles the normal
distribution, as can be seen in Figure 11.7, which shows the density functions of $T_n$ for
two values of $n$ together with that of the standard normal distribution. In fact $T_n$ tends to
the standard normal distribution as $n \to \infty$. The parameter of the t distribution (whose value
here is one less than the size of the sample) is usually called the number of degrees
of freedom.

Figure 11.7 Density functions of $T_n$ and $Z$.

Defining $t_{\alpha,n-1}$ by

$$P(T_n > t_{\alpha,n-1}) = \alpha$$

(by analogy with $z_\alpha$), we can derive a $100(1 - \alpha)\%$ confidence interval for the mean
by the method used in Section 11.3.3:

$$\left(\bar{X} \pm t_{\alpha/2,n-1} \frac{S_{X,n-1}}{\sqrt{n}}\right)$$


This takes explicit account of the uncertainty caused by the use of $S_{X,n-1}$ in place of
$\sigma_X$. Values of $t_{\alpha,n}$ for typical values of $\alpha$ can be read directly from the table of the
t distribution, an example of which is shown in Figure 11.8. To obtain a test statistic
for an assumed mean $\mu_0$, simply replace $\mu_X$ by $\mu_0$ in the definition of $T_n$.

Figure 11.8 Table of the t distribution $t_{\alpha,n}$. (Based on Table 12 of Biometrika Tables
for Statisticians, Volume 1. Cambridge University Press, 1954. By permission of the
Biometrika trustees.)

  n    α = 0.10   α = 0.05   α = 0.025   α = 0.01   α = 0.005
  1     3.078      6.314     12.706      31.821     63.657
  2     1.886      2.920      4.303       6.965      9.925
  3     1.638      2.353      3.182       4.541      5.841
  4     1.533      2.132      2.776       3.747      4.604
  5     1.476      2.015      2.571       3.365      4.032
  6     1.440      1.943      2.447       3.143      3.707
  7     1.415      1.895      2.365       2.998      3.499
  8     1.397      1.860      2.306       2.896      3.355
  9     1.383      1.833      2.262       2.821      3.250
 10     1.372      1.812      2.228       2.764      3.169
 11     1.363      1.796      2.201       2.718      3.106
 12     1.356      1.782      2.179       2.681      3.055
 13     1.350      1.771      2.160       2.650      3.012
 14     1.345      1.761      2.145       2.624      2.977
 15     1.341      1.753      2.131       2.602      2.947
 16     1.337      1.746      2.120       2.583      2.921
 17     1.333      1.740      2.110       2.567      2.898
 18     1.330      1.734      2.101       2.552      2.878
 19     1.328      1.729      2.093       2.539      2.861
 20     1.325      1.725      2.086       2.528      2.845
 21     1.323      1.721      2.080       2.518      2.831
 22     1.321      1.717      2.074       2.508      2.819
 23     1.319      1.714      2.069       2.500      2.807
 24     1.318      1.711      2.064       2.492      2.797
 25     1.316      1.708      2.060       2.485      2.787
 26     1.315      1.706      2.056       2.479      2.779
 27     1.314      1.703      2.052       2.473      2.771
 28     1.313      1.701      2.048       2.467      2.763
 29     1.311      1.699      2.045       2.462      2.756
  ∞     1.282      1.645      1.960       2.326      2.576


Example 11.5

The measured lifetimes of a sample of 20 electronic components gave an average of
1250 h, with a sample standard deviation of 96 h. Assuming that the lifetime has a normal
distribution, find a 95% confidence interval for the mean lifetime of the population, and
test the hypothesis that the mean is 1300 h.


Solution

The appropriate figure from the t table is $t_{0.025,19} = 2.093$, so the 95% confidence interval is

$$(1250 \pm 2.093(96)/\sqrt{20}) = (1205, 1295)$$

The claim that the mean lifetime is 1300 h is therefore rejected at the 5% significance
level. The same conclusion is reached by evaluating

$$T_n = \frac{1250 - 1300}{96/\sqrt{20}} = -2.33$$

which exceeds $t_{0.025,19}$ in magnitude.
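
The arithmetic of Example 11.5 can be reproduced with a few lines of Python. This is a
sketch only: the function names are hypothetical, and because the Python standard library
has no t distribution the critical value 2.093 is simply copied from the row for 19 degrees
of freedom in Figure 11.8.

```python
from math import sqrt

def t_interval(sample_mean, s, n, t_critical):
    """Confidence interval for the mean when the variance is estimated.

    s is the sample standard deviation (the 'n - 1' form) and t_critical
    is t_{alpha/2, n-1}, looked up in a table of the t distribution.
    """
    half_width = t_critical * s / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

def t_statistic(sample_mean, mu0, s, n):
    """Test statistic for the hypothesized mean mu0."""
    return (sample_mean - mu0) / (s / sqrt(n))

T_0025_19 = 2.093  # t_{0.025,19}, from the table in Figure 11.8

low, high = t_interval(1250, 96, 20, T_0025_19)
t_value = t_statistic(1250, 1300, 96, 20)
rejected = abs(t_value) > T_0025_19

print(round(low), round(high), round(t_value, 2), rejected)
```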


Difference between means


Now suppose that we have not just a single sample but two samples from different populations,
and that we wish to compare the separate means. Assume also that the variances
of the two populations are equal but unknown (the most common situation). Then it can
be shown that the $100(1 - \alpha)\%$ confidence interval for the difference $\mu_1 - \mu_2$ between
the means is

$$\left(\bar{X}_1 - \bar{X}_2 \pm t_{\alpha/2,n} S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\right)$$

where $\bar{X}_1$ and $\bar{X}_2$ are the respective sample averages, $n_1$ and $n_2$ are the respective
sample sizes, $S_1^2$ and $S_2^2$ are the respective sample variances (using the ‘$n - 1$’ form
as above),

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

is a pooled estimate of the unknown variance, and

$$n = n_1 + n_2 - 2$$

is the parameter for the t table. The corresponding test statistic for an assumed
difference $d_0 = \mu_1 - \mu_2$ is

$$T_n = \frac{\bar{X}_1 - \bar{X}_2 - d_0}{S_p \sqrt{1/n_1 + 1/n_2}}$$

For small samples the populations have to be normal, but for larger samples this is not
required and the t-table figure can be replaced by $z_{\alpha/2}$.


Example 11.6

Two kinds of a new plastic material are to be compared for strength. From tensile strength
measurements of 10 similar pieces of each type, the sample averages and standard
deviations were as follows:

$$\bar{X}_1 = 78.3, \quad S_1 = 5.6, \quad \bar{X}_2 = 84.2, \quad S_2 = 6.3$$

Compare the mean strengths, assuming normal data.


Solution

The pooled estimate of the standard deviation is 5.960, the t table gives $t_{0.025,18} = 2.101$,
and the 95% confidence interval for the difference between means is

$$(78.3 - 84.2 \pm 2.101(5.96)/\sqrt{5}) = (-11.5, -0.3)$$

The difference is significant at the 5% level because zero does not lie within the interval.
Also, assuming zero difference gives

$$T_n = \frac{78.3 - 84.2}{5.96/\sqrt{5}} = -2.21$$

which confirms the 5% significance.
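
The pooled calculation for two samples can be sketched in Python as follows. The function
names are hypothetical, the data are those of Example 11.6, and the critical value 2.101 is
copied from the row for 18 degrees of freedom in Figure 11.8 because the standard library
does not provide the t distribution.

```python
from math import sqrt

def pooled_sd(s1, n1, s2, n2):
    """Pooled estimate S_p of the common standard deviation."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def difference_interval(mean1, s1, n1, mean2, s2, n2, t_critical):
    """Confidence interval for mu_1 - mu_2, assuming equal variances."""
    sp = pooled_sd(s1, n1, s2, n2)
    half_width = t_critical * sp * sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    return diff - half_width, diff + half_width

def difference_statistic(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """Test statistic for an assumed difference d0 between the means."""
    sp = pooled_sd(s1, n1, s2, n2)
    return (mean1 - mean2 - d0) / (sp * sqrt(1 / n1 + 1 / n2))

T_0025_18 = 2.101  # t_{0.025,18}, from the table in Figure 11.8

sp = pooled_sd(5.6, 10, 6.3, 10)
low, high = difference_interval(78.3, 5.6, 10, 84.2, 6.3, 10, T_0025_18)
t_value = difference_statistic(78.3, 5.6, 10, 84.2, 6.3, 10)

print(round(sp, 3), round(low, 1), round(high, 1), round(t_value, 2))
```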
It is also possible to set up confidence intervals and tests for the variance $\sigma_X^2$, or for
comparing two variances for different populations. The process of testing means and
variances within and between several populations is called analysis of variance. This
has many applications, and is well covered in statistics textbooks.


11.3.6 Interval and test for proportion


The ideas of interval estimation do not just apply to means. If probability is interpreted
as a long-term proportion (which is one of the common interpretations) then measuring
a sample proportion is a way of estimating a probability. The binomial distribution
(Section 11.2.3) points the way. We count the number of ‘successes’, say $X$, in $n$ ‘trials’,
and estimate the probability $p$ of success at each trial, or the long-term proportion, by
the sample proportion

$$\hat{p} = \frac{X}{n}$$

(it is common in statistics to place the ‘hat’ symbol ^ over a parameter to denote an
estimate of that parameter). This only provides a point estimate. To obtain a confidence
interval, we can exploit the normal approximation to the binomial (Section 11.2.4)

$$X \sim N(np, np(1 - p))$$

approximately, for large $n$. Dividing by $n$ preserves normality, so

$$\frac{X}{n} \sim N\left(p, \frac{p(1 - p)}{n}\right)$$

Following the argument in Section 11.3.3, we have

$$P\left(p - z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}} < \frac{X}{n} < p + z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}}\right) = 1 - \alpha$$

and, after rearranging the inequality,

$$P\left(\frac{X}{n} - z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}} < p < \frac{X}{n} + z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}}\right) = 1 - \alpha$$

Because $p$ is unknown, we have to make a further approximation by replacing $p$ by $\hat{p}$
inside the square root, to give an approximate $100(1 - \alpha)\%$ confidence interval for $p$:

$$\left(\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\right)$$

The corresponding test statistic for an assumed proportion $p_0$ is

$$Z = \frac{X - np_0}{\sqrt{[np_0(1 - p_0)]}}$$

with $\pm z_{\alpha/2}$ as the rejection points for significance level $\alpha$.
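
A minimal Python sketch of the interval and test for a proportion is given below. The
function names are hypothetical. The counts (300 successes in 1000 trials) correspond to the
opinion poll of Example 11.7, while the hypothesized proportion of 0.33 is an assumed value
introduced purely to exercise the test statistic.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, alpha):
    """Approximate 100(1 - alpha)% confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)          # z_{alpha/2}
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)   # p replaced by p_hat
    return p_hat - half_width, p_hat + half_width

def proportion_statistic(successes, n, p0):
    """Test statistic Z for the assumed proportion p0."""
    return (successes - n * p0) / sqrt(n * p0 * (1 - p0))

low, high = proportion_interval(300, 1000, 0.05)
z_value = proportion_statistic(300, 1000, 0.33)   # p0 = 0.33 is hypothetical

print(round(low, 2), round(high, 2), round(z_value, 2))
```

With these numbers the assumed proportion lies just outside the 95% interval, and the test
statistic is correspondingly just beyond the rejection point.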


Example 11.7

In an opinion poll conducted with a sample of 1000 people chosen at random, 30% said
that they support a certain political party. Find a 95% confidence interval for the actual
proportion of the population who support this party.


Solution

The required confidence interval is obtained directly as

$$\left(0.3 \pm 1.96\sqrt{\frac{(0.3)(0.7)}{1000}}\right) = (0.27, 0.33)$$

A variation of about 3% either way is therefore to be expected when conducting opinion
polls with sample sizes of this order, which is fairly typical, and this figure is often
quoted in the news media as an indication of maximum likely error.


A similar argument that also exploits the fact that the difference between two
independent normal random variables is also normal leads to the following $100(1 - \alpha)\%$
confidence interval for the difference between two proportions, when $\hat{p}_1$ and $\hat{p}_2$ are
the respective sample proportions:

$$\left(\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}\right)$$

Again it is assumed that $n_1$ and $n_2$ are reasonably large. The test statistic for equality
of proportions is

$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{[\hat{p}(1 - \hat{p})(1/n_1 + 1/n_2)]}}$$

where $\hat{p} = (X_1 + X_2)/(n_1 + n_2)$ is a pooled estimate of the proportion.


Example 11.8

One hundred samples of an alloy are tested for resistance to fatigue. Half have been
prepared using a new process and the other half by a standard process. Of those prepared
by the new process, 35 exhibit good fatigue resistance, whereas only 25 of those
prepared in the standard way show the same performance. Is the new process better
than the standard one?


Solution

The proportions of good samples are 0.7 for the new process and 0.5 for the standard
one, so a 95% confidence interval for the difference between the true proportions is

$$\left(0.7 - 0.5 \pm 1.96\sqrt{\frac{(0.7)(0.3)}{50} + \frac{(0.5)(0.5)}{50}}\right) = (0.01, 0.39)$$

The pooled estimate of proportion is

$$\hat{p} = (35 + 25)/(50 + 50) = 0.6$$

so that

$$Z = \frac{0.7 - 0.5}{\sqrt{[(0.6)(0.4)/25]}} = 2.04$$

Both approaches show that the difference is significant at the 5% level. However, it is 
only just so: if one more sample for the new process had been less fatigue-resistant, the 
difference would not have been significant at this level. This suggests that the new process 
is effective — but, despite the apparently large difference in success rates, the evidence 
is not very strong. 
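
The remark about one further sample can be checked directly. The Python sketch below uses
hypothetical function names and the counts of Example 11.8, and then repeats the test with
34 good samples in place of 35 for the new process.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, alpha):
    """Approximate confidence interval for the difference p_1 - p_2."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return p1 - p2 - half_width, p1 - p2 + half_width

def equality_statistic(x1, n1, x2, n2):
    """Test statistic Z for equality of two proportions (pooled estimate)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    return (p1 - p2) / sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

interval_35 = two_proportion_interval(35, 50, 25, 50, 0.05)
z_35 = equality_statistic(35, 50, 25, 50)

# One fewer fatigue-resistant sample for the new process.
interval_34 = two_proportion_interval(34, 50, 25, 50, 0.05)
z_34 = equality_statistic(34, 50, 25, 50)

print([round(v, 2) for v in interval_35], round(z_35, 2))
print([round(v, 2) for v in interval_34], round(z_34, 2))
```

With 34 good samples the statistic drops below 1.96 and the interval includes zero, in
agreement with the statement that the significance at the 5% level is only marginal.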


This method only applies to independent sample proportions. It would not be legitim- 
ate to apply it, for instance, to a more elaborate version of the opinion poll (Example 11.7) 
in which respondents can choose between two (or more) political parties or else support 
neither. Support for one party usually precludes support for another, so the proportions 
of those interviewed who support the two parties are not independent. More elaborate 
confidence intervals, based on the multinomial distribution, can handle such situations. 
This shows how important it is to understand the assumptions upon which statistical 
methods are based. It would be very easy to look up ‘difference between proportions’ 
in an index and apply an inappropriate formula. 


11.3.7 Exercises 


An electrical firm manufactures light bulbs whose 
lifetime is approximately normally distributed with 
a standard deviation of 50 h. 


(a) If a sample of 30 bulbs has an average life of 
7780 h, find a 95% confidence interval for the
mean lifetime of the population. 

(b) How large a sample is needed if we wish to be 
95% confident that our sample average will be 
within 10h of the population mean? 


Monthly rainfall measurements (in mm) were taken 
at a certain location for three years, with results 
as follows: 


38 48 50 94 105 53 81 91 110 103 90 84 
115 113 35 130 77 67 72 113 98 37 61 91 
9 112 29 16 56 61 82 132 48 68 114 55 


Find the average monthly rainfall for this period. 
Also find a 95% confidence interval for the mean 
monthly rainfall, using the measured standard 
deviation as an estimate of the true value. 


Quantities of a trace impurity in 12 specimens of a 
new material are measured (in parts per million) 
as follows: 


8.8, 7.1, 7.9, 10.2, 8.9, 7.7, 10.6, 9.4, 9.2, 7.5, 
9.0, 8.4 


Find a 95% confidence interval for the population 
mean, assuming that the distribution is normal. 


A sample of 30 pieces of a semiconductor material 
gave an average resistivity of 73.2 mΩ m, with a
sample standard deviation of 5.4 mΩ m. Obtain
a 95% confidence interval for the resistivity of
the material, and test the hypothesis that this is
75 mΩ m.


The mean weight loss of 16 grinding balls after a 
certain length of time in mill slurry is 3.42 g, with 
a standard deviation of 0.68 g. Construct a 99%
confidence interval for the true mean weight 

loss of such grinding balls under the stated 
conditions. 


While performing a certain task under simulated 
weightlessness, the pulse rate of 32 astronaut 
trainees increased on the average by 26.4 beats per 
minute, with a standard deviation of 4.28 beats per 
minute. Construct a 95% confidence interval for the
true average increase in the pulse rate of astronaut
trainees performing the given task.


The quality of a liquid being used in an etching
process is monitored automatically by measuring 

the attenuation of a certain wavelength of light 

passing through it. The criterion is that when the 
attenuation reaches 58%, the liquid is declared as 

‘spent’. Ten samples of the liquid are used until 

they are judged as ‘spent’ by the experts. The light 
attenuation is then measured, and gives an average 

result of 56%, with a standard deviation of 3%.
Is the criterion satisfactory? 


A fleet car company has to decide between 

two brands A and B of tyre for its cars. An 
experiment is conducted using 12 of each brand, 
run until they wear out. The sample averages 
and standard deviations of running distance 

(in km) are respectively 36 300 and 5000 for A, 
and 39 100 and 6100 for B. Obtain a 95% 
confidence interval for the difference in means,
assuming the distributions to be normal, and test 
the hypothesis that brand B tyres outrun brand 
A tyres. 


A manufacturer claims that the lifetime of a 
particular electronic component is unaffected by 
temperature variations within the range 0—60 °C. 
Two samples of these components were tested, 
and their measured lifetimes (in hours) recorded 
as follows: 


0°C: 7250, 6970, 7370, 7910, 6790, 6850, 
7280, 7830 


60°C: 7030, 7270, 6510, 6700, 7350, 6770, 
6220, 7230 


Assuming that the lifetimes have a normal 
distribution, find 90% and 95% confidence 
intervals for the difference between the mean 
lifetimes at the two temperatures, and hence test 
the manufacturer’s claim at the 5% and 10% 
significance levels. 




Suppose that out of 540 drivers tested at random, 38 
were found to have consumed more than the legal 
limit of alcohol. Find 90% and 95% confidence 
intervals for the true proportion of drivers who were 
over the limit during the time of the tests. Are the 
results compatible with the hypothesis that this 
proportion is less than 5%? 


It is known that approximately one-quarter of 

all houses in a certain area have inadequate loft 
insulation. How many houses should be inspected 
if the difference between the estimated and true 
proportions having inadequate loft insulation is 
not to exceed 0.05, with probability 90%? If in fact 
200 houses are inspected, and 55 of them have 
inadequate loft insulation, find a 90% confidence 
interval for the true proportion. 


A drug-manufacturer claims that the proportion 
of patients exhibiting side-effects to their new 
anti-arthritis drug is at least 8% lower than 

for the standard brand X. In a controlled 
experiment 31 out of 100 patients receiving 

the new drug exhibited side-effects, as did 74 
out of 150 patients receiving brand X. Test the 
manufacturer’s claim using 90% and 95% 
confidence intervals. 


Suppose that 10 years ago 500 people were 
working in a factory, and 180 of them were 
exposed to a material which is now suspected 
as being carcinogenic. Of those 180, 30 have 
since developed cancer, whereas 32 of the other 
workers (who were not exposed) have also since 
developed cancer. Obtain a 95% confidence 
interval for the difference between the 
proportions with cancer among those exposed 
and not exposed, and assess whether the material 
should be considered carcinogenic, on this 
evidence. 


11.4 Joint distributions and correlation


Just as it is possible for events to be dependent upon one another in that information to
the effect that one has occurred changes the probability of the other, so it is possible for
random variables to be associated in value. In this section we show how the degree of
dependence between two random variables can be defined and measured.


11.4.1 Joint and marginal distributions


The idea that two variables, each of which is random, can be associated in some way
might seem mysterious at first, but can be clarified with some familiar examples. For
instance, if one chooses a person at random and measures his or her height and weight,
each measurement is a random variable, but we know that taller people also tend to
be heavier than shorter people, so the outcomes will be related. On the other hand, a
person’s birthday and telephone number are not likely to be related in any way. In
general, we need a measure of the simultaneous distribution of two random variables.

For two discrete random variables $X$ and $Y$ with possible values $\{u_1, \ldots, u_m\}$ and
$\{v_1, \ldots, v_n\}$ respectively, the joint distribution of $X$ and $Y$ is the set of all joint
probabilities of the form

$$P(X = u_k \cap Y = v_j) \quad (k = 1, \ldots, m;\ j = 1, \ldots, n)$$

The joint distribution contains all relevant information about the random variables
separately, as well as their joint behaviour. To obtain the distribution of one variable,
we sum over the possible values of the other:

$$P(X = u_k) = \sum_{j=1}^{n} P(X = u_k \cap Y = v_j) \quad (k = 1, \ldots, m)$$

$$P(Y = v_j) = \sum_{k=1}^{m} P(X = u_k \cap Y = v_j) \quad (j = 1, \ldots, n)$$

The distributions obtained in this way are called marginal distributions of $X$ and $Y$.


Example 11.9

Two textbooks are selected at random from a shelf containing three statistics texts, two
mathematics texts and three engineering texts. Denoting the number of books selected
in each subject by $S$, $M$ and $E$ respectively, find (a) the joint distribution of $S$ and $M$,
and (b) the marginal distributions of $S$, $M$ and $E$.


Solution

Figure 11.9 Joint distribution for Example 11.9.

                  M
   S       0       1       2     Total
   0      3/28    3/14    1/28    5/14
   1      9/28    3/14     0     15/28
   2      3/28     0       0      3/28
 Total   15/28    3/7     1/28     1

(a) The joint distribution (shown in Figure 11.9) is built up element by element using
the addition and product rules of probability as follows:

$$P(S = M = 0) = P(E = 2) = \left(\tfrac{3}{8}\right)\left(\tfrac{2}{7}\right) = \tfrac{3}{28}$$

that is, the probability that the first book is an engineering text (three chances out
of eight) times the probability that the second book is also (two remaining chances
out of seven). Continuing,

$$P(S = 0 \cap M = 1) = P(M = 1 \cap E = 1) = \left(\tfrac{2}{8}\right)\left(\tfrac{3}{7}\right) + \left(\tfrac{3}{8}\right)\left(\tfrac{2}{7}\right) = \tfrac{3}{14}$$
that is, the probability that the first book is a mathematics text and the second an 


engineering text, plus the (equal) probability of the books being the other way 
round. The other probabilities are derived similarly. 








(b) The marginal distributions of S and M are just the row and column totals as shown 
in Figure 11.9. The marginal distribution of E can also be derived from the table: 


$$P(E = 2) = P(S = M = 0) = \tfrac{3}{28}$$
$$P(E = 1) = P(S = 1 \cap M = 0) + P(S = 0 \cap M = 1) = \tfrac{15}{28}$$
$$P(E = 0) = P(S = 2) + P(S = 1 \cap M = 1) + P(M = 2) = \tfrac{5}{14}$$


This is the same as the marginal distribution of S, which is not surprising, because 
there are the same numbers of engineering and statistics books on the shelf. 
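
The table for Example 11.9 can also be obtained by brute-force enumeration. The Python sketch
below is one possible way of doing this, not the method used in the text: it lists all 28
equally likely pairs of books and accumulates exact probabilities with fractions. The variable
names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations

# Three statistics (S), two mathematics (M) and three engineering (E) texts.
shelf = ["S"] * 3 + ["M"] * 2 + ["E"] * 3
pairs = list(combinations(range(len(shelf)), 2))   # 28 equally likely pairs

joint = defaultdict(Fraction)        # joint[(s, m)] = P(S = s and M = m)
for i, j in pairs:
    chosen = (shelf[i], shelf[j])
    joint[(chosen.count("S"), chosen.count("M"))] += Fraction(1, len(pairs))

marginal_S = defaultdict(Fraction)
marginal_M = defaultdict(Fraction)
marginal_E = defaultdict(Fraction)
for (s, m), prob in joint.items():
    marginal_S[s] += prob            # sum over the values of M
    marginal_M[m] += prob            # sum over the values of S
    marginal_E[2 - s - m] += prob    # the two books not in S or M are in E

for name, dist in (("S", marginal_S), ("M", marginal_M), ("E", marginal_E)):
    print(name, {k: str(v) for k, v in sorted(dist.items())})
```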


In order to apply these ideas of joint and marginal distributions to continuous random 
variables, we need to build on the interpretation of the probability density function. 
The joint density function of two continuous random variables $X$ and $Y$, denoted by
$f_{X,Y}(x, y)$, is such that

$$P(x_1 < X < x_2 \text{ and } y_1 < Y < y_2) = \int_{x_1}^{x_2} \int_{y_1}^{y_2} f_{X,Y}(x, y)\,dy\,dx$$

for all intervals $(x_1, x_2)$ and $(y_1, y_2)$. This involves a double integral over the two variables
x and y. This is necessary because the joint density function must indicate the relative 
likelihood of every combination of values of X and Y, just as the joint distribution 
does for discrete random variables. The joint density function is transformed into a 
probability by integrating over an interval for both variables. The double integral here 
can be regarded as a pair of single-variable integrations, with the outer variable (x) held 
constant during the integration with respect to the inner variable ( y). In fact the same 
answer is obtained if the integration is performed the other way around.


The marginal density functions for X and Y are obtained from the joint density 


function in a manner analogous to the discrete case: by integrating over all values 
of the unwanted variable: 


$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy \quad (-\infty < x < \infty)$$

$$f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx \quad (-\infty < y < \infty)$$


Example 11.10

The joint density function of random variables $X$ and $Y$ is

$$f_{X,Y}(x, y) = \begin{cases} 1 & (0 \le x \le 1,\ cx \le y \le cx + 1) \\ 0 & \text{otherwise} \end{cases}$$

where $c$ is a constant such that $0 \le c \le 1$ (which means that $f_{X,Y}(x, y)$ is unity over the
trapezoidal area shown in Figure 11.10 and zero elsewhere). Find the marginal distributions
of $X$ and $Y$. Also find the probability that neither $X$ nor $Y$ exceeds one-half,
assuming $c = 1$.

Figure 11.10 Density function for Example 11.10.


Solution

To find the marginal distribution of $X$, we integrate with respect to $y$:

$$f_X(x) = \begin{cases} \displaystyle\int_{cx}^{cx+1} dy = 1 & (0 \le x \le 1) \\ 0 & \text{otherwise} \end{cases}$$

The marginal distribution for $Y$ is rather more complicated. Integrating with respect to
$x$ and assuming that $0 < c \le 1$,

$$f_Y(y) = \begin{cases} (1 + c - y)/c & (1 \le y \le 1 + c) \\ 1 & (c \le y \le 1) \\ y/c & (0 \le y \le c) \\ 0 & \text{otherwise} \end{cases}$$

(Exercise 16). When $c = 0$, the marginal distribution for $Y$ is the same as that for $X$.
Finally, when $c = 1$,

$$P(X \le \tfrac{1}{2} \text{ and } Y \le \tfrac{1}{2}) = \int_{0}^{1/2} \int_{x}^{1/2} 1\,dy\,dx = \int_{0}^{1/2} (\tfrac{1}{2} - x)\,dx = \tfrac{1}{8}$$

Here the inner integral (with respect to $y$) is performed with $x$ treated as constant, and
the resulting function of $x$ is integrated to give the answer.


The definitions of joint and marginal distributions can be extended to any number of
random variables.


11.4.2 Independence


The idea of independence of events can be extended to random variables to give us the 
important case in which no information is shared between them. This is important in 
experiments where essentially the same quantity is measured repeatedly, either within a 
single experiment involving repetition or between different experiments. As mentioned 
before, independence within a sample is one of the properties that qualifies the sample 
for analysis and conclusion.

Two random variables X and Y are called independent if their joint distribution
factorizes into the product of their marginal distributions:

$$P(X = u_k \cap Y = v_j) = P(X = u_k)\,P(Y = v_j) \quad \text{in the discrete case}$$

$$f_{X,Y}(x, y) = f_X(x)\,f_Y(y) \quad \text{in the continuous case}$$

For example, the random variables X and Y in Example 11.10 are independent if and
only if c = 0.

Example 11.11

The assembly of a complex piece of equipment can be divided into two stages. The
times (in hours) required for the two stages are random variables (X and Y, say) with
density functions $e^{-x}$ and $2e^{-2y}$ respectively. Assuming that the stage assembly times are
independent, find the probability that the assembly will be completed within four hours.

Solution

The assumption of independence implies that

$$f_{X,Y}(x, y) = f_X(x)\,f_Y(y) = 2e^{-x-2y}$$

If the time for the first stage is x, the total time will not exceed four hours if

$$Y < 4 - x$$

so the required value is

$$P(X + Y < 4) = \int_0^4 \int_0^{4-x} f_{X,Y}(x, y)\,dy\,dx = \int_0^4 \int_0^{4-x} 2e^{-x-2y}\,dy\,dx = \int_0^4 (e^{-x} - e^{x-8})\,dx = 0.964$$
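The value quoted in Example 11.11 can be checked numerically. The following is a minimal sketch, not part of the original solution: the function name `prob_total_within` and the step count are illustrative assumptions. The inner integral over y is evaluated analytically, the outer integral by the midpoint rule, and the result is compared with the closed form $1 - 2e^{-4} + e^{-8}$ obtained by integrating $e^{-x} - e^{x-8}$ from 0 to 4.

```python
import math


def prob_total_within(t, steps=2000):
    """Midpoint-rule estimate of P(X + Y < t) for densities e^{-x} and 2e^{-2y}."""
    h = t / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        # Inner integral of 2e^{-2y} over 0 < y < t - x is 1 - e^{-2(t - x)}
        total += math.exp(-x) * (1.0 - math.exp(-2.0 * (t - x))) * h
    return total


# Closed form of the outer integral for t = 4
closed_form = 1.0 - 2.0 * math.exp(-4.0) + math.exp(-8.0)
numeric = prob_total_within(4.0)
print(round(closed_form, 3), round(numeric, 3))
```

Both values agree with the figure of 0.964 given in the solution.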


Where random variables are dependent upon one another, it is possible to express
this dependence by defining a conditional distribution analogous to conditional
probability, in terms of the joint distribution (or density function) and the marginal
distributions. Instead of pursuing this idea here, we shall consider a numerical measure of
dependence that can be estimated from sample data.


11.4.3 Covariance and correlation


The use of mean and variance for a random variable is motivated partly by the difficulty 
in determining the full probability distribution in many practical cases. The joint dis- 
tribution of two variables presents even greater difficulties. Since we already have 
numerical measures of location and dispersion for the variables individually, it seems 
reasonable to define a measure of association of the two variables that is independent 
of their separate means and variances so that the new measure provides essentially new 
information about the variables. 

There are four objectives that it seems reasonable for such a measure to satisfy. Its
value should

(a) be zero for independent variables,
(b) be non-zero for dependent variables,
(c) indicate the degree of dependence in some well-defined sense, detached from the
individual means and variances,
(d) be easy to estimate from sample data.


It is actually rather difficult to satisfy all of these, but the most popular measure of 
association gets most of the way. 


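The covariance of random variables X and Y, denoted by Cov(X, Y), is defined as

$$\mathrm{Cov}(X, Y) = E\{(X - \mu_X)(Y - \mu_Y)\}$$

$$= \sum_{k=1}^{m} \sum_{j=1}^{n} (u_k - \mu_X)(v_j - \mu_Y)\,P(X = u_k \cap Y = v_j)$$

$$= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\,f_{X,Y}(x, y)\,dx\,dy$$

for discrete and continuous variables respectively. The correlation $\rho_{X,Y}$ is the covariance
divided by the product of the standard deviations:

$$\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$$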

If whenever the random variable X is larger than its mean the random variable Y also tends
to be larger than its mean then the product $(X - \mu_X)(Y - \mu_Y)$ will tend to be positive.
The same will be true if both variables tend to be smaller than their means simultaneously.
The covariance is then positive. A negative covariance implies that the variables
tend to move in opposite directions with respect to their means. Both covariance and
correlation therefore measure association relative to the mean values of the variables.
It turns out that correlation measures association relative to the standard deviations
as well.

It should be noted that the variance of a random variable X is the same as the
covariance with itself:

$$\mathrm{Var}(X) = \mathrm{Cov}(X, X)$$

Also, by expanding the product within the integral or sum in the definition of covariance,
it is easy to show that an alternative expression is

$$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y)$$
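The equivalence of the two expressions for the covariance can be illustrated for a discrete joint distribution. The sketch below is illustrative only: the joint distribution `example_pmf` is hypothetical (it is not taken from the text), and the function name `covariance_and_correlation` is an assumption made for this example.

```python
import math


def covariance_and_correlation(pmf):
    """pmf maps pairs (u, v) to P(X = u and Y = v)."""
    mean_x = sum(p * u for (u, v), p in pmf.items())
    mean_y = sum(p * v for (u, v), p in pmf.items())
    # Definition: E{(X - mu_X)(Y - mu_Y)}
    cov_definition = sum(p * (u - mean_x) * (v - mean_y) for (u, v), p in pmf.items())
    # Alternative expression: E(XY) - E(X)E(Y)
    cov_shortcut = sum(p * u * v for (u, v), p in pmf.items()) - mean_x * mean_y
    var_x = sum(p * (u - mean_x) ** 2 for (u, v), p in pmf.items())
    var_y = sum(p * (v - mean_y) ** 2 for (u, v), p in pmf.items())
    rho = cov_definition / math.sqrt(var_x * var_y)
    return cov_definition, cov_shortcut, rho


# Hypothetical joint distribution; the probabilities sum to one.
example_pmf = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}
cov_def, cov_alt, rho = covariance_and_correlation(example_pmf)
print(cov_def, cov_alt, rho)
```

For this distribution E(X) = 0.5, E(Y) = 0.6 and E(XY) = 0.4, so both expressions give a covariance of 0.1.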


Although the sign of the covariance indicates the direction of the dependence, its 
magnitude depends not only on the degree of dependence but also upon the variances 
of the random variables, so it fails to satisfy the objective (c). In contrast, the correlation 
is limited in range 


$$-1 \le \rho_{X,Y} \le +1$$

and it adopts the limiting values of this range only when the random variables are
linearly related:

$$\rho_{X,Y} = \pm 1 \quad \text{if and only if there exist } a, b \text{ such that } Y = aX + b$$

(this is proved in most textbooks on probability theory). The magnitude of the correlation
indicates the degree of linear relationship, so that objective (c) is satisfied.

Example 11.12

Find the correlation of the random variables S and M in Example 11.9.

Solution

The joint and marginal distributions of S and M are shown in Figure 11.9. First we find
the expected values of S and $S^2$ from the marginal distribution, and hence the variance
and standard deviation:

$$E(S) = \tfrac{15}{28} + (2)(\tfrac{3}{28}) = \tfrac{3}{4}, \qquad E(S^2) = \tfrac{15}{28} + (4)(\tfrac{3}{28}) = \tfrac{27}{28}$$

$$\mathrm{Var}(S) = \tfrac{27}{28} - (\tfrac{3}{4})^2 = \tfrac{45}{112}$$

from which

$$\sigma_S = \tfrac{3}{4}\sqrt{\tfrac{5}{7}}$$

Next we do the same for M:

$$E(M) = \tfrac{12}{28} + (2)(\tfrac{1}{28}) = \tfrac{1}{2}, \qquad E(M^2) = \tfrac{12}{28} + (4)(\tfrac{1}{28}) = \tfrac{4}{7}$$

$$\mathrm{Var}(M) = \tfrac{4}{7} - \tfrac{1}{4} = \tfrac{9}{28}$$

from which

$$\sigma_M = \tfrac{3}{2}\,\frac{1}{\sqrt{7}}$$

All products of S and M are zero except when both are equal to one, so the expected
value of the product is

$$E(SM) = \tfrac{3}{14}$$

The correlation now follows easily:

$$\rho_{S,M} = \frac{E(SM) - E(S)E(M)}{\sigma_S \sigma_M} = \frac{\tfrac{3}{14} - (\tfrac{3}{4})(\tfrac{1}{2})}{(\tfrac{3}{4}\sqrt{\tfrac{5}{7}})(\tfrac{3}{2}\tfrac{1}{\sqrt{7}})} = -\frac{1}{\sqrt{5}}$$

The correlation is negative because if there are more statistics books in the selection
then there will tend to be fewer mathematics books, and vice versa.

Example 11.13

Find the correlation of the random variables X and Y in Example 11.10.

Solution

Proceeding as in Example 11.12, we have for X


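$$E(X) = \int_0^1 x\,dx = \tfrac{1}{2}$$

$$E(X^2) = \int_0^1 x^2\,dx = \tfrac{1}{3}$$

so that $\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \tfrac{1}{12}$. Also, for Y

$$E(Y) = \int_0^c \frac{y^2}{c}\,dy + \int_c^1 y\,dy + \int_1^{1+c} y\left[1 - \tfrac{1}{c}(y - 1)\right] dy = \tfrac{1}{2}(1 + c) \quad \text{after simplification}$$

$$E(Y^2) = \int_0^c \frac{y^3}{c}\,dy + \int_c^1 y^2\,dy + \int_1^{1+c} y^2\left[1 - \tfrac{1}{c}(y - 1)\right] dy = \tfrac{1}{3}(1 + c^2) + \tfrac{1}{2}c \quad \text{after simplification}$$

so that $\mathrm{Var}(Y) = E(Y^2) - [E(Y)]^2 = \tfrac{1}{12}(1 + c^2)$. For the expected value of the product
we have

$$E(XY) = \int_0^1 \int_{cx}^{cx+1} xy\,dy\,dx = \tfrac{1}{2}\int_0^1 x(1 + 2cx)\,dx = \tfrac{1}{4} + \tfrac{1}{3}c$$

Finally, the correlation between X and Y is

$$\rho_{X,Y} = \frac{E(XY) - E(X)E(Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{\tfrac{1}{4} + \tfrac{1}{3}c - \tfrac{1}{4}(1 + c)}{\tfrac{1}{12}\sqrt{1 + c^2}} = \frac{c}{\sqrt{1 + c^2}}$$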


Note that in fact the result of Example 11.13 holds for any value of c, and not just for 
the range 0 < c < 1 assumed in Example 11.10. As the value of c increases (positive 
or negative), the correlation increases also, but its magnitude never exceeds one. It is 
also clear that if X and Y are independent then c = 0 and the correlation is zero. Refer 
to Figure 11.10 for a geometrical interpretation. When c = 0, the sample space is a 
square within which all points are equally likely, so there is no association between 
the variables. As c increases (positive or negative), the sample space becomes more 
elongated as the variables become more tightly coupled to one another. 
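The result $\rho_{X,Y} = c/\sqrt{1 + c^2}$ can be checked by simulation. The sketch below is illustrative rather than part of the text: it assumes that a pair (X, Y) with the density of Example 11.10 can be generated by drawing X uniformly on (0, 1) and setting Y = cX + U, where U is an independent uniform variable on (0, 1), which follows because for each x the density is unity on cx ≤ y ≤ cx + 1. The function names, sample size and seed are arbitrary choices.

```python
import math
import random


def simulated_correlation(c, n=100_000, seed=1):
    """Sample correlation of (X, Y) with X ~ U(0,1) and Y = cX + U, U ~ U(0,1)."""
    rng = random.Random(seed)
    sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0.0
    for _ in range(n):
        x = rng.random()
        y = c * x + rng.random()
        sum_x += x
        sum_y += y
        sum_xx += x * x
        sum_yy += y * y
        sum_xy += x * y
    mean_x = sum_x / n
    mean_y = sum_y / n
    cov = sum_xy / n - mean_x * mean_y
    var_x = sum_xx / n - mean_x ** 2
    var_y = sum_yy / n - mean_y ** 2
    return cov / math.sqrt(var_x * var_y)


def exact_correlation(c):
    """Closed-form correlation c / sqrt(1 + c^2) from Example 11.13."""
    return c / math.sqrt(1.0 + c * c)


for c in (0.0, 0.5, 1.0, -2.0):
    print(c, round(simulated_correlation(c), 3), round(exact_correlation(c), 3))
```

The simulated and exact values should agree to about two decimal places, for negative as well as positive c.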


The general relationship between independence and correlation is expressed as 
follows: if the random variables X and Y are independent then their correlation is zero. 
This is easily shown as follows for continuous random variables (or by a similar argument
for discrete random variables). First we have

$$f_{X,Y}(x, y) = f_X(x)\,f_Y(y)$$

and then

$$\mathrm{Cov}(X, Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\,f_X(x)\,f_Y(y)\,dx\,dy$$

$$= \int_{-\infty}^{\infty} (x - \mu_X)\,f_X(x)\,dx \int_{-\infty}^{\infty} (y - \mu_Y)\,f_Y(y)\,dy = (\mu_X - \mu_X)(\mu_Y - \mu_Y) = 0$$

Unfortunately, the converse does not hold: zero correlation does not imply independence.
In general, correlation is a measure of linear dependence, and may be zero or very small
for variables that are dependent in a nonlinear way (see Exercise 15). Objective (a) is
satisfied, therefore, but not objective (b) in general.

Another problem with correlation is that a non-zero value does not imply the presence
of a causal relationship between the variables or the phenomena that they measure.
Correlation can be 'spurious', deriving from some third variable that may be unrecognized
at the time. For example, among the economic statistics that are gathered together
from many countries and published, there are figures for the number of telephones per
head of population, birth rate, and the gross domestic product per capita (GDP). It turns
out that there is a large negative correlation between number of telephones per head and
the birth rate, but no-one would suggest that telephones have any direct application in
birth control. The GDP is a measure of wealth, and there is a large positive correlation
between this and the number of telephones per head, and a large negative correlation
between GDP and birth rate, both for quite genuine reasons. The correlation between
telephones and birth rate is therefore spurious, and a more sophisticated measure called
the partial correlation can be used to eliminate the third variable (provided that it is
recognized and measured).

We have considered all the objectives except (d); that this is satisfied is shown in
Section 11.4.4.


11.4.4 Sample correlation


There are two kinds of situations where we take samples of values of two random 
variables X and Y. First we might be interested in the same property for two different 
populations. Perhaps there is evidence that the mean values are different, so we take 
samples of each and compare them. This situation was discussed in Section 11.3.5. The 
second kind involves two different properties for the same population. It is to this situation
that correlation applies. We take a single sample from the population and measure
the pair of random variables $(X_i, Y_i)$ for each i = 1, ..., n.

For a sample $\{(X_1, Y_1), \ldots, (X_n, Y_n)\}$ the sample correlation coefficient is
defined as

$$r_{X,Y} = \frac{\dfrac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{S_X S_Y}$$


Like the true correlation, the sample correlation is limited in value to the range [-1, 1]
and $r_{X,Y} = \pm 1$ when (and only when) all of the points lie along a line. Figure 11.11 contains
four typical scatter diagrams of samples plotted on the (x, y) plane, with an indication
of the correlation for each one. The range of behaviour is shown from independence
(a) through imperfect correlation (b) and (c) to a perfect linear relationship (d).

Figure 11.11 Scatter plots for two random variables: (a) $\rho_{X,Y} = 0$; (b) $\rho_{X,Y} > 0$; (c) $\rho_{X,Y} < 0$; (d) $\rho_{X,Y} = 1$.

By expanding the product within the outer bracket in the numerator, it is easy to
show that an alternative expression is

$$r_{X,Y} = \frac{\overline{XY} - (\bar{X})(\bar{Y})}{S_X S_Y}$$

which is useful for hand calculation. For computer calculation the best method involves
the successive sums of products:

$$Q_{XY,k} = \sum_{i=1}^{k} (X_i - M_{X,k})(Y_i - M_{Y,k})$$

where

$$M_{X,k} = \frac{1}{k}\sum_{i=1}^{k} X_i, \qquad M_{Y,k} = \frac{1}{k}\sum_{i=1}^{k} Y_i$$

and then $r_{X,Y} = Q_{XY,n}/(n S_X S_Y)$. The pseudocode listing in Figure 11.12 exploits the
recurrence relation

$$Q_{XY,k} = Q_{XY,k-1} + \left(1 - \frac{1}{k}\right)(X_k - M_{X,k-1})(Y_k - M_{Y,k-1})$$

which allows for a single pass through the data with no loss of accuracy.

Figure 11.12 Pseudocode listing for sample correlation.

    { Program to compute sample correlation.
      x(k) and y(k) are the arrays of data,
      n is the sample size,
      xbar and ybar are the sample averages,
      sx and sy are the sample standard deviations,
      rxy is the sample correlation,
      Mx, My, Qx, Qy, Qxy hold running totals. }

    Mx ← 0; My ← 0
    Qx ← 0; Qy ← 0
    Qxy ← 0
    for k is 1 to n do
      diffx ← x(k) - Mx
      diffy ← y(k) - My
      Mx ← ((k - 1)*Mx + x(k))/k
      My ← ((k - 1)*My + y(k))/k
      Qx ← Qx + (1 - 1/k)*diffx*diffx
      Qy ← Qy + (1 - 1/k)*diffy*diffy
      Qxy ← Qxy + (1 - 1/k)*diffx*diffy
    endfor
    xbar ← Mx; ybar ← My
    sx ← sqrt(Qx/n); sy ← sqrt(Qy/n)
    rxy ← Qxy/(n*sx*sy)
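The single-pass recurrence translates directly into Python. The following is a minimal sketch: the function name `sample_correlation` is illustrative, the running mean is updated in the algebraically equivalent form M + (x - M)/k, and the data are the impurity and performance values of Example 11.14.

```python
import math


def sample_correlation(xs, ys):
    """Single-pass sample correlation using running means and sums of products."""
    mx = my = qx = qy = qxy = 0.0
    n = 0
    for k, (x, y) in enumerate(zip(xs, ys), start=1):
        diff_x = x - mx            # deviation from the mean of the first k - 1 values
        diff_y = y - my
        mx += diff_x / k           # same as ((k - 1)*mx + x)/k
        my += diff_y / k
        weight = 1.0 - 1.0 / k
        qx += weight * diff_x * diff_x
        qy += weight * diff_y * diff_y
        qxy += weight * diff_x * diff_y
        n = k
    sx = math.sqrt(qx / n)         # standard deviations with divisor n
    sy = math.sqrt(qy / n)
    return qxy / (n * sx * sy)


# Data of Example 11.14: % impurity and performance index for 22 specimens
impurity = [4.4, 5.5, 4.2, 3.0, 4.5, 4.9, 4.6, 5.0, 4.7, 5.1, 4.4,
            4.1, 4.9, 4.7, 5.0, 4.6, 3.6, 4.9, 5.1, 4.8, 5.2, 5.2]
performance = [12, 14, 18, 35, 23, 29, 16, 12, 18, 21, 27,
               13, 19, 22, 20, 16, 27, 21, 13, 18, 17, 11]
r_xy = sample_correlation(impurity, performance)
print(round(r_xy, 2))
```

The function assumes at least two data pairs with non-zero spread in each variable; no input checking is attempted in this sketch.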


Example 11.14

A material used in the construction industry contains an impurity suspected of having
an adverse effect upon the material's performance in resisting long-term operational
stresses. Percentages of impurity and performance indexes for 22 specimens of this
material are as follows:

% Impurity X_i  | 4.4  5.5  4.2  3.0  4.5  4.9  4.6  5.0  4.7  5.1  4.4
Performance Y_i | 12   14   18   35   23   29   16   12   18   21   27

% Impurity X_i  | 4.1  4.9  4.7  5.0  4.6  3.6  4.9  5.1  4.8  5.2  5.2
Performance Y_i | 13   19   22   20   16   27   21   13   18   17   11

Find the sample correlation coefficient.

Solution

The following quantities are easily obtained from the data:

$$\bar{X} = 4.6545, \quad S_X = 0.55081, \quad \bar{Y} = 19.1818, \quad S_Y = 6.0350, \quad \overline{XY} = 87.3591$$

(Note that it is advisable to record these results to several significant digits in order to avoid
losing precision when calculating the difference within the numerator of $r_{X,Y}$.) The
sample correlation is then $r_{X,Y} = -0.58$, the negative value suggesting that the impurity
has an adverse effect upon performance. It remains to be seen whether this is statistically
significant.


11.4.5 Interval and test for correlation

Correlation is more difficult to deal with than mean and proportion, but for normal
random variables X and Y with a true correlation $\rho_{X,Y}$ the sample statistic

$$Z = \frac{\sqrt{n - 3}}{2} \ln\left[\frac{(1 + r_{X,Y})(1 - \rho_{X,Y})}{(1 - r_{X,Y})(1 + \rho_{X,Y})}\right]$$

is approximately standard normal for large n. This can be used directly as a test
statistic for an assumed value of $\rho_{X,Y}$. Alternatively, an approximate $100(1 - \alpha)\%$
confidence interval for $\rho_{X,Y}$ can be derived:

$$\left(\frac{1 + r - c(1 - r)}{1 + r + c(1 - r)},\ \frac{1 + r - (1 - r)/c}{1 + r + (1 - r)/c}\right)$$

where

$$c = \exp\left(\frac{2 z_{\alpha/2}}{\sqrt{n - 3}}\right)$$

(the subscripts X and Y have been dropped from $r_{X,Y}$ in this formula).

Example 11.15

For the data in Example 11.14 find 95% and 99% confidence intervals for the true
correlation between percentage of impurity and performance index, and test the hypothesis
that these are independent.

Solution

The sample correlation (from the 22 specimens) was found in Example 11.14 to be
-0.58. For the 95% confidence interval the constant c = 2.458 and the interval itself is
(-0.80, -0.21). Similarly, the 99% confidence interval is (-0.85, -0.07). Assuming
$\rho_{X,Y} = 0$, the value of the test statistic is Z = -2.89, which exceeds $z_{0.005} = 2.576$ in
magnitude. Either way, we can be more than 99% confident that the impurity has an
adverse effect upon performance.
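The interval and test statistic for correlation are straightforward to evaluate by machine. The sketch below is illustrative: the function names are assumptions, and the normal quantiles 1.96 and 2.576 are supplied as constants rather than computed. It reproduces the figures of Example 11.15 from r = -0.58 and n = 22.

```python
import math


def correlation_test_statistic(r, n, rho0=0.0):
    """Approximately standard normal statistic for an assumed true correlation rho0."""
    ratio = (1 + r) * (1 - rho0) / ((1 - r) * (1 + rho0))
    return 0.5 * math.sqrt(n - 3) * math.log(ratio)


def correlation_interval(r, n, z_crit):
    """Approximate confidence interval for the true correlation; z_crit is z_{alpha/2}."""
    c = math.exp(2.0 * z_crit / math.sqrt(n - 3))
    lower = (1 + r - c * (1 - r)) / (1 + r + c * (1 - r))
    upper = (1 + r - (1 - r) / c) / (1 + r + (1 - r) / c)
    return lower, upper


r, n = -0.58, 22
z_stat = correlation_test_statistic(r, n)
ci95 = correlation_interval(r, n, 1.96)     # z_{0.025} = 1.96
ci99 = correlation_interval(r, n, 2.576)    # z_{0.005} = 2.576
print(round(z_stat, 2), [round(v, 2) for v in ci95], [round(v, 2) for v in ci99])
```

The interval is the image, under the inverse of the transformation used in Z, of a symmetric normal interval, which is why it is not symmetric about r.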


11.4.6 Rank correlation


As has been previously emphasized, the correlation only works as a measure of dependence if

(1) n is reasonably large,
(2) X and Y are numerical characteristics,
(3) the dependence is linear, and
(4) X and Y each have a normal distribution.

There is an alternative form of sample correlation, which has greater applicability,
requiring only that

(1) n is reasonably large,
(2) X and Y are rankable characteristics, and
(3) the dependence is monotonic (that is, always in the same direction, which may be
forward or inverse, but not necessarily linear).

The variables X and Y can have any distribution. For a set of data $X_1, \ldots, X_n$, a rank of 1
is assigned to the smallest value, 2 to the next-smallest and so on up to a rank of n assigned
to the largest. This applies wherever the values are distinct. Tied values are given the
mean of the ranks they would receive if slightly different. The following is an example:

X_i  | 8     3    5  8     1  9   6  5  3    5  7  2
Rank | 10.5  3.5  6  10.5  1  12  8  6  3.5  6  9  2

The Spearman rank correlation coefficient $r_S$ for data $(X_1, Y_1), \ldots, (X_n, Y_n)$ is the
correlation of the ranks of $X_i$ and $Y_i$, where the data $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_n$ are
ranked separately. If the number of tied values is small compared with n then

$$r_S \approx 1 - \frac{6}{n(n^2 - 1)} \sum_{i=1}^{n} d_i^2$$

where $d_i$ is the difference between the rank of $X_i$ and that of $Y_i$. The value of $r_S$
always lies in the interval [-1, 1], and adopts its extreme values only when the
rankings precisely match (forwards or in reverse).

To test for dependence, special tables must be used for small samples (n < 20),
but for larger samples the test statistic

$$Z = r_S \sqrt{n - 1}$$

is approximately standard normal.


Example 11.16

Find and test the rank correlation for the data in Example 11.14.

Solution

The data with their ranks are as follows:

X_i  | 4.4  5.5  4.2  3.0  4.5  4.9  4.6  5.0   4.7   5.1   4.4
Rank | 5.5  22   4    1    7    14   8.5  16.5  10.5  18.5  5.5
Y_i  | 12   14   18   35   23   29   16   12    18    21    27
Rank | 2.5  6    11   22   18   21   7.5  2.5   11    15.5  19.5

X_i  | 4.1  4.9  4.7   5.0   4.6  3.6   4.9   5.1   4.8  5.2   5.2
Rank | 3    14   10.5  16.5  8.5  2     14    18.5  12   20.5  20.5
Y_i  | 13   19   22    20    16   27    21    13    18   17    11
Rank | 4.5  13   17    14    7.5  19.5  15.5  4.5   11   9     1

From this, the rank correlation is $r_S = -0.361$, and Z = -1.66, which exceeds $z_{0.05} =
1.645$ in magnitude and is therefore just significant at the 10% level. If the approximate
formula is used, the sum of squares of differences is 2398, so

$$r_S \approx 1 - \frac{(6)(2398)}{(22)(483)} = -0.354$$

and Z = -1.62, which is just short of significance.

These results show that the rank correlation is a more conservative test than the 
sample correlation ry, in that a larger sample tends to be needed before the hypothesis 
of independence is rejected. A price has to be paid for the wider applicability of the 
method. 
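The ranking rule for tied values and both forms of the rank correlation can be sketched in Python as follows. This is an illustration only: the helper names `average_ranks`, `pearson` and `spearman` are assumptions, and ties are detected by exact equality of the recorded values.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1          # positions i..j are 0-based
        for m in range(i, j + 1):
            ranks[order[m]] = mean_rank
        i = j + 1
    return ranks


def pearson(u, v):
    """Ordinary sample correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def spearman(xs, ys):
    """Return (correlation of ranks, approximate formula, sum of squared differences)."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return pearson(rx, ry), 1 - 6 * d2 / (n * (n * n - 1)), d2


impurity = [4.4, 5.5, 4.2, 3.0, 4.5, 4.9, 4.6, 5.0, 4.7, 5.1, 4.4,
            4.1, 4.9, 4.7, 5.0, 4.6, 3.6, 4.9, 5.1, 4.8, 5.2, 5.2]
performance = [12, 14, 18, 35, 23, 29, 16, 12, 18, 21, 27,
               13, 19, 22, 20, 16, 27, 21, 13, 18, 17, 11]
r_exact, r_approx, d2 = spearman(impurity, performance)
z_exact = r_exact * math.sqrt(len(impurity) - 1)
print(round(r_exact, 3), round(r_approx, 3), d2, round(z_exact, 2))
```

Applied to the data of Example 11.14, the two forms differ only in the third decimal place, which reflects the modest number of ties.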




11.4.7 Exercises


14 Suppose that the random variables X and Y have the following joint distribution:

            X
   Y  |  1     2     3
   1  |  0     0.17  0.08
   2  |  0.20  0.11  0
   3  |  0.14  0.25  0.05

Find (a) the marginal distributions of X and Y, (b) P(Y = 3 | X = 2), and (c) the mean,
variance and correlation coefficient of X and Y.

15 Consider the random variable X with density function

$$f_X(x) = \begin{cases} 1 & (-\tfrac{1}{2} < x < \tfrac{1}{2}) \\ 0 & \text{otherwise} \end{cases}$$

Show that the covariance of X and $X^2$ is zero. (This shows that zero covariance does not
imply independence, because obviously $X^2$ is dependent on X.)

16 The joint density function of random variables X and Y is

$$f_{X,Y}(x, y) = \begin{cases} 1 & (0 \le x \le 1,\ cx \le y \le cx + 1) \\ 0 & \text{otherwise} \end{cases}$$

where c is a constant such that $0 \le c \le 1$. Find the marginal density function for Y
(see Example 11.10).


17 Let the random variables X and Y represent the lifetimes (in hundreds of hours) of
two types of components used in an electronic system. The joint density function is
given by


Луб) = uer (x > 0, > 0) 
| 0 


otherwise 


Find (a) the probability that two components (one 
of each type) will each last longer than 100 h, and 
(b) the probability that a component of the second 
type (Y) will have a lifetime in excess of 200 h. 


18 The following are the measured heights and weights of eight people:

Height (cm) | 182.8  162.5  175.2  185.4  170.1  167.6  177.8  172.7
Weight (kg) | 86.1   58.3   83.0   92.4   60.2   69.3   83.6   72.7

Find the sample correlation coefficient.


19 The number of minutes it took 10 mechanics to assemble a piece of machinery in the
morning (X) and in the late afternoon (Y) were measured, with the following results:

X | 11.1  10.3  12.0  15.1  13.7  18.5  17.3  14.2  14.8  15.3
Y | 10.9  14.2  13.8  21.5  13.2  21.1  16.4  19.3  17.4  19.0

Find the sample correlation coefficient.


20 If the sample correlation between resistance and failure time for 30 overloaded
resistors is 0.7, find a 95% confidence interval for the true correlation.

21 Find a 95% confidence interval for correlation between height and weight using the
data in Exercise 18.

22 Marks obtained by 20 students taking examinations in mathematics and computer
studies were as follows:

Math. | 45  77  43  64  58  64  58  54  71  45
      | 57  52  67  57  54  54  61  58  55  42
Comp. | 64  67  47  75  42  65  58  42  70  44
      | 44  67  49  70  51  58  37  60  42  36

Find the sample correlation coefficient and the 90% and 95% confidence intervals. Hence
test the hypothesis that the two marks are independent at the 5% and 10% significance
levels. Also find and test the rank correlation.

23 Let the random variables X and Y have joint density function given by

$$f_{X,Y}(x, y) = \begin{cases} c(1 - y) & (0 \le x \le y \le 1) \\ 0 & \text{otherwise} \end{cases}$$

Find (a) the value of the constant c, (b) P(X < …, Y > …) and (c) the marginal density
functions for X and Y.

24 The ball and socket of a joint are separately moulded and then assembled together.
The diameter of the ball is a random variable X between 29.8 and 30.3 mm, all values
being equally likely. The internal diameter of the socket is a random variable Y between
30.1 and 30.6 mm, again with all values equally likely. The condition for an acceptable
fit is that $0 \le Y - X \le 0.6$ mm. Find the probability of this condition being satisfied,
assuming that the random variables are independent.


11.5 Regression

A procedure that is very familiar to engineers is that of drawing a good straight line 
through a set of points on a graph. When calibrating a measuring instrument, for example, 
known inputs are applied, the readings are noted and plotted, a straight line is drawn as 
close to the points as possible (there are bound to be small errors, so they will not all 
lie on the line), and the graph is then used to interpret the readings for unknown inputs. 
It is possible to draw the line by eye, but there is a better way, which involves calculating 
the slope and intercept of the line from the data. The given line then minimizes the total
squared error for the data points. This procedure (which for historical reasons is called
regression) can be applied in general to pairs of random variables.

Computer packages are very often used to carry out the regression calculations and
display the results. This is of special value when the data tend to follow a curve and
various nonlinear models are tried and compared (see Section 11.5.4).

Figure 11.13 Scatter plot with regression line (Example 11.17).


11.5.1 The method of least squares


The correlation was introduced in Section 11.4.3 as a way of measuring the dependence 
between random variables. Subsequently, we have seen how the correlation can be 
estimated and the dependence tested using sample data. We can take the idea of correlation
between variables (say X and Y) a stage further by assuming that the sample pairs
$\{(X_1, Y_1), \ldots, (X_n, Y_n)\}$ satisfy a linear relationship of the form

$$Y_i = a + bX_i + E_i \quad (i = 1, \ldots, n)$$

where a and b are unknown coefficients and the random variables E; have zero mean 
and represent residual errors. This assumption is prompted by the scatter diagrams 
in Figure 11.11, which illustrate how the points may be concentrated around a line. 
Figure 11.13 shows a typical scatter diagram again, this time with the line drawn in. If 
we can estimate the coefficients a and b so as to give the best fit, we shall be able to 
predict the value of Y when the value of X is known. 

The least-squares approach is to choose estimates $\hat{a}$ and $\hat{b}$ to minimize the sum of
squares of the values $E_i$:

$$Q = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} [Y_i - (\hat{a} + \hat{b}X_i)]^2$$

Equating to zero the partial derivatives of this sum with respect to the two coefficients
gives a pair of equations that determine the minimum:

$$\frac{\partial Q}{\partial \hat{a}} = 0 = -2\sum_i [Y_i - (\hat{a} + \hat{b}X_i)]$$

$$\frac{\partial Q}{\partial \hat{b}} = 0 = -2\sum_i X_i [Y_i - (\hat{a} + \hat{b}X_i)]$$

These can be rewritten as

$$n\hat{a} + \Big(\sum_i X_i\Big)\hat{b} = \sum_i Y_i$$

$$\Big(\sum_i X_i\Big)\hat{a} + \Big(\sum_i X_i^2\Big)\hat{b} = \sum_i X_i Y_i$$

(where $\sum_i = \sum_{i=1}^{n}$), from which the solution is

$$\hat{b} = \frac{S_{XY}}{S_X^2}, \qquad \hat{a} = \bar{Y} - \hat{b}\bar{X}$$

where

$$S_{XY} = \frac{1}{n}\sum_i (X_i - \bar{X})(Y_i - \bar{Y}) = \overline{XY} - (\bar{X})(\bar{Y})$$

$$S_X^2 = \frac{1}{n}\sum_i (X_i - \bar{X})^2 = \overline{X^2} - (\bar{X})^2$$

$$S_Y^2 = \frac{1}{n}\sum_i (Y_i - \bar{Y})^2 = \overline{Y^2} - (\bar{Y})^2$$

are the sample variances and covariance.

This process of fitting a straight line through a set of data of the form $\{(X_1, Y_1), \ldots,
(X_n, Y_n)\}$ is called linear regression, and the coefficients are called regression coefficients.

Example 11.17

A strain gauge has been bonded to a steel beam, and is being calibrated. The resistance
of the strain gauge is converted into a voltage appearing on a meter. Known forces (X,
in kN) are applied and voltmeter measurements (Y, in V) are as follows:

X | 1    2    3    4    5    6     7     8     9     10    11    12    13    14
Y | 4.4  4.9  6.4  7.3  8.8  10.3  11.7  13.2  14.8  15.3  16.5  17.2  18.9  19.3

Fit a regression line through the data and estimate the tension in the beam when the
meter reading is 13.8 V.

Solution

The following quantities are calculated from the data:

$$\bar{X} = 7.5, \quad S_X = 4.03113, \quad \bar{Y} = 12.0714, \quad S_Y = 4.95068, \quad \overline{XY} = 110.421$$

(When using a hand calculator to solve linear regression problems, it is advisable to
work to at least five or six significant digits during intermediate calculations, because
the subtraction in the numerator of $\hat{b}$ often results in the loss of some leading digits.)

From these results, $\hat{b} = 1.22$ and $\hat{a} = 2.89$ (Figure 11.13). The estimated value of tension
for a reading of Y = 13.8 V is given by

$$13.8 = 2.89 + 1.22X$$

from which X = 8.9 kN.


Figure 11.14 shows a pseudocode listing for linear regression. The program is very
similar to that in Figure 11.12 for the sample correlation, and the link between these
will be explained in Section 11.5.3. In addition to the regression coefficients $\hat{a}$ and $\hat{b}$,
an estimate of the residual standard deviation is returned, which is explained below.

Figure 11.14 Pseudocode listing for linear regression.

    { Program to compute linear regression coefficients.
      x(k) and y(k) are the arrays of data,
      n is the sample size,
      xbar and ybar are the sample averages,
      sx and sy are the sample standard deviations,
      bhat is the regression slope result,
      ahat is the regression intercept result,
      se is the residual standard deviation,
      Mx, My, Qx, Qy, Qxy hold running totals. }

    Mx ← 0; My ← 0
    Qx ← 0; Qy ← 0
    Qxy ← 0
    for k is 1 to n do
      diffx ← x(k) - Mx
      diffy ← y(k) - My
      Mx ← ((k - 1)*Mx + x(k))/k
      My ← ((k - 1)*My + y(k))/k
      Qx ← Qx + (1 - 1/k)*diffx*diffx
      Qy ← Qy + (1 - 1/k)*diffy*diffy
      Qxy ← Qxy + (1 - 1/k)*diffx*diffy
    endfor
    xbar ← Mx; ybar ← My
    sx ← sqrt(Qx/n); sy ← sqrt(Qy/n)
    bhat ← Qxy/(n*sx*sx)
    ahat ← ybar - bhat*xbar
    se ← sqrt(sy*sy - bhat*bhat*sx*sx)
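A Python rendering of the same single-pass calculation is sketched below. It is illustrative only: the function name `linear_regression` and the guard against a slightly negative residual variance caused by rounding are assumptions of this sketch. The data are those of the strain-gauge calibration in Example 11.17.

```python
import math


def linear_regression(xs, ys):
    """Return (ahat, bhat, se) from one pass through the data."""
    mx = my = qx = qy = qxy = 0.0
    n = 0
    for k, (x, y) in enumerate(zip(xs, ys), start=1):
        diff_x = x - mx
        diff_y = y - my
        mx += diff_x / k                      # running means
        my += diff_y / k
        weight = 1.0 - 1.0 / k
        qx += weight * diff_x * diff_x        # running sums of squares and products
        qy += weight * diff_y * diff_y
        qxy += weight * diff_x * diff_y
        n = k
    sx2 = qx / n                              # sample variances with divisor n
    sy2 = qy / n
    bhat = qxy / (n * sx2)
    ahat = my - bhat * mx
    se = math.sqrt(max(sy2 - bhat * bhat * sx2, 0.0))
    return ahat, bhat, se


force = list(range(1, 15))                    # kN
voltage = [4.4, 4.9, 6.4, 7.3, 8.8, 10.3, 11.7,
           13.2, 14.8, 15.3, 16.5, 17.2, 18.9, 19.3]   # V
ahat, bhat, se = linear_regression(force, voltage)
tension = (13.8 - ahat) / bhat                # force corresponding to a 13.8 V reading
print(round(ahat, 2), round(bhat, 2), round(se, 3), round(tension, 1))
```

The printed slope, intercept and inverse prediction can be compared with the values 1.22, 2.89 and 8.9 kN found in Example 11.17.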








11.5.2 Normal residuals


The process of fitting a straight line through the data by minimizing the sum of squares
of the errors does not involve any statistics as such. However, we often need to test
whether the slope of the regression line is significantly different from zero, because this
will reveal whether there is any dependence between the random variables. For this
purpose we must make the assumption that the errors $E_i$, called the residuals, have a
normal distribution:

$$E_i \sim N(0, \sigma_E^2)$$

The unknown variance $\sigma_E^2$ can be estimated by defining

$$S_E^2 = \frac{1}{n}\sum_{i=1}^{n} E_i^2 = \frac{1}{n}\sum_{i=1}^{n} [Y_i - (\hat{a} + \hat{b}X_i)]^2$$

Using the earlier result that $\hat{a} = \bar{Y} - \hat{b}\bar{X}$ gives a more convenient form:

$$S_E^2 = \frac{1}{n}\sum_{i=1}^{n} [(Y_i - \bar{Y}) - \hat{b}(X_i - \bar{X})]^2$$

$$= \frac{1}{n}\sum_{i=1}^{n} [(Y_i - \bar{Y})^2 - 2\hat{b}(X_i - \bar{X})(Y_i - \bar{Y}) + \hat{b}^2(X_i - \bar{X})^2]$$

$$= S_Y^2 - 2\hat{b}S_{XY} + \hat{b}^2 S_X^2$$

$$= S_Y^2 - \hat{b}^2 S_X^2$$
This result is used in the pseudocode listing (Figure 11.14).

Various confidence intervals are derived in more advanced texts covering linear
regression. Here the most useful results will simply be quoted. The $100(1 - \alpha)\%$
confidence interval for the regression slope b is given by

$$\left(\hat{b} \pm t_{\alpha/2,\,n-2}\,\frac{S_E}{S_X\sqrt{n - 2}}\right)$$

It is often useful to have an estimate of the mean value of Y for a given value of X,
say X = x. The point estimate is $\hat{a} + \hat{b}x$, and the $100(1 - \alpha)\%$ confidence interval for
this is

$$\left(\hat{a} + \hat{b}x \pm t_{\alpha/2,\,n-2}\,S_E\sqrt{\frac{1 + (x - \bar{X})^2/S_X^2}{n - 2}}\right)$$

Example 11.18

Estimate the residual standard deviation and find a 95% confidence interval for the
regression slope for the data in Example 11.17. Also test the hypothesis that the tension
in the beam is 10 kN when a voltmeter reading of 15 V is obtained.

Solution

Using the results obtained in Example 11.17, the residual standard deviation is

$$S_E = 0.418$$

and, using $t_{0.025,12} = 2.179$, the 95% confidence interval for b is

$$\left(1.22 \pm 2.179\,\frac{0.418}{(4.031)\sqrt{12}}\right) = (1.16, 1.29)$$

Obviously the regression slope is significant, but this is not in doubt. To test the
hypothesis that the tension is 10 kN, we can use the 95% confidence interval for the
corresponding voltage, which is

$$\left(2.89 + 1.22(10) \pm 2.179(0.418)\sqrt{\frac{1 + (10 - 7.5)^2/(4.031)^2}{12}}\right) = (14.8, 15.4)$$

The measured value of 15 V lies within this interval, so the hypothesis is accepted at
the 5% level. A better way to approach this would be to reverse the regression (use
force as the Y variable and voltage as the X variable), so that a confidence interval for
the tension in the beam for a given voltage could be obtained and the assumed value
tested. For the present data this gives (9.6, 10.1) at 95%, so again the hypothesis is
accepted (Exercise 27).
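The two quoted intervals are simple to evaluate once the regression quantities are known. The following sketch is illustrative: the function names are assumptions, the t quantile 2.179 is supplied as a constant taken from the solution rather than computed, and the regression coefficients of Example 11.17 are entered to four decimal places (1.2237 and 2.8934) so that rounding of the slope does not affect the second decimal place of the interval.

```python
import math


def slope_interval(bhat, se, sx, n, t_crit):
    """Confidence interval for the regression slope; t_crit is t_{alpha/2, n-2}."""
    half_width = t_crit * se / (sx * math.sqrt(n - 2))
    return bhat - half_width, bhat + half_width


def mean_response_interval(ahat, bhat, se, sx, xbar, n, x, t_crit):
    """Confidence interval for the mean value of Y at X = x."""
    centre = ahat + bhat * x
    half_width = t_crit * se * math.sqrt((1 + (x - xbar) ** 2 / sx ** 2) / (n - 2))
    return centre - half_width, centre + half_width


# Quantities from Examples 11.17 and 11.18 (n = 14, t_{0.025,12} = 2.179)
slope_ci = slope_interval(1.2237, 0.418, 4.031, 14, 2.179)
voltage_ci = mean_response_interval(2.8934, 1.2237, 0.418, 4.031, 7.5, 14, 10, 2.179)
print([round(v, 2) for v in slope_ci], [round(v, 1) for v in voltage_ci])
```

The width of the second interval grows with the distance of x from the sample mean, so predictions far from the centre of the data are less certain.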


11.5.3 Regression and correlation


Both regression and correlation are statistical methods for measuring the linear dependence
of one random variable upon another, so it is not surprising that there is a connection
between them. From the definition of the sample correlation $r_{X,Y}$ (Section 11.4.4) and
the result for the regression slope $\hat{b}$, it follows immediately that

$$r_{X,Y} = \frac{S_{XY}}{S_X S_Y} = \hat{b}\,\frac{S_X}{S_Y}$$

Another expression for the residual variance is then

$$S_E^2 = S_Y^2 - \frac{S_{XY}^2}{S_X^2 S_Y^2}\,S_Y^2 = S_Y^2(1 - r_{X,Y}^2)$$

This result has an important interpretation. $S_Y^2$ is the total variation in the Y values, and
$S_E^2$ is the residual variation after the regression line has been identified, so $r_{X,Y}^2$ is the
proportion of the total variation in the Y values that is accounted for by the regression
on X: informally, it represents how closely the points are clustered about the line. This
is a measure of goodness of fit that is especially useful when the dependence between
X and Y is nonlinear and different models are to be compared.


11.5.4 Nonlinear regression


Sometimes the dependence between two random variables is nonlinear, and this shows
clearly in the scatter plot; see for instance Figure 11.15. Fitting a straight line through
the data would hardly be appropriate. Instead, various models of the dependence can be
assumed and tested. In each case the value of $r_{X,Y}^2$ indicates the success of the model in
capturing the dependence. One form of nonlinear regression model involves a quadratic
or higher-degree polynomial:

$$Y_i = a_0 + a_1 X_i + a_2 X_i^2 + E_i \quad (i = 1, \ldots, n)$$

The three coefficients $a_0$, $a_1$ and $a_2$ can be identified by a multivariate regression method
that is beyond the scope of this text. A simpler approach is to try models of the form

$$Y_i = aX_i^b F_i \quad (i = 1, \ldots, n)$$

or

$$Y_i = F_i \exp(a + bX_i) \quad (i = 1, \ldots, n)$$

where each $F_i$ is a residual multiplicative error. On taking logarithms, these models
reduce to the standard linear form:

$$\ln Y_i = \ln a + b \ln X_i + E_i \quad (i = 1, \ldots, n)$$

or

$$\ln Y_i = a + bX_i + E_i \quad (i = 1, \ldots, n)$$

which can be solved by the usual method.

Figure 11.15 Nonlinear regression (Example 11.19): pressure P/mbar plotted against height.

Example 11.19

The following data for atmospheric pressure (P, in mbar) at various heights (H, in km)
have been obtained:

Height H   | 0     4    8    12   16   20
Pressure P | 1012  621  286  141  104  54

The relationship between height and pressure is believed to be of the form

$$P = e^{a + bH}$$

where a and b are constants. Fit and assess a model of this form and predict the
atmospheric pressure at a height of 14 km.

Solution

Taking logarithms and setting Y = ln P gives

$$Y_i = a + bH_i + E_i$$

for which the following results are easily obtained:

$$\bar{H} = 10, \quad S_H = 6.83130, \quad \bar{Y} = 5.43152, \quad S_Y = 1.01638, \quad \overline{HY} = 47.4081$$

Hence

$$\hat{b} = \frac{\overline{HY} - (\bar{H})(\bar{Y})}{S_H^2} = -0.148$$

$$\hat{a} = \bar{Y} - \hat{b}\bar{H} = 6.91$$

Also, $r_{H,Y}^2 = 0.99$, which implies that the fit is very good (Figure 11.15). In this case
there is not much point in trying other models. Finally, the predicted pressure at a height
of 14 km is

$$P = e^{6.91 - 0.148(14)} = 126 \text{ mbar}$$
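The logarithmic transformation used in Example 11.19 can be carried out as follows. This is a minimal sketch under the stated model P = exp(a + bH): the function name `fit_exponential` is an assumption, and the sample moments use the divisor n as elsewhere in this section.

```python
import math


def fit_exponential(hs, ps):
    """Fit ln P = a + b H by least squares; return (ahat, bhat, r squared)."""
    ys = [math.log(p) for p in ps]
    n = len(hs)
    h_bar = sum(hs) / n
    y_bar = sum(ys) / n
    s_hy = sum(h * y for h, y in zip(hs, ys)) / n - h_bar * y_bar
    s_h2 = sum(h * h for h in hs) / n - h_bar ** 2
    s_y2 = sum(y * y for y in ys) / n - y_bar ** 2
    bhat = s_hy / s_h2
    ahat = y_bar - bhat * h_bar
    r_squared = s_hy ** 2 / (s_h2 * s_y2)     # proportion of variation explained
    return ahat, bhat, r_squared


heights = [0, 4, 8, 12, 16, 20]               # km
pressures = [1012, 621, 286, 141, 104, 54]    # mbar
ahat, bhat, r_squared = fit_exponential(heights, pressures)
predicted = math.exp(ahat + bhat * 14)        # pressure at a height of 14 km
print(round(ahat, 2), round(bhat, 3), round(r_squared, 2), round(predicted))
```

Because the fit minimizes squared errors in ln P rather than in P, the residuals are multiplicative on the original scale, as the model assumes.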


11.5.5 Exercises


25 Ten files of audio data are annotated by a human labeller. The time (s) this takes per
file is a function of the length of the file (s) as follows:

File length (X)     | 5.4   7.9   10.0  14.2  16.1  16.8  19.6  22.0  25.0  26.7
Annotation time (Y) | 13.1  17.3  23.9  30.1  33.5  40.0  43.6  46.5  52.4  60.7

Find the linear regression coefficients.

26 Measured deflections (in mm) of a structure under a load (in kg) were recorded as
follows:

Load X       | 1   2   3   4   5   6   7    8    9    10   11   12
Deflection Y | 16  35  45  74  86  96  106  124  134  156  164  182

Draw a scatter plot of the data. Find the linear regression coefficients and predict the
deflection for a load of 15 kg.

27 Using the data in Example 11.17, carry out a regression of force against voltage, and
obtain a 95% confidence interval for the tension in the beam when the voltmeter reads
15 V, as described in Example 11.18.

28 Weekly advertising expenditures $X_i$ and sales $Y_i$ for a company are as follows (in
units of £100):

X_i | 40   20   25   20   30   50   40   20   50   40   25   50
Y_i | 385  400  395  365  475  440  490  420  560  525  480  510

(a) Fit a regression line and predict the sales for an advertising expenditure of £6000.
(b) Estimate the residual standard deviation and find a 95% confidence interval for the
regression slope. Hence test the hypothesis that the sales do not depend upon
advertising expenditure.
(c) Find a 95% confidence interval for the mean sales when advertising expenditure is
£6000.

29 A machine that can be run at different speeds produces articles, of which a certain
number are defective. The number of defective items produced per hour depends upon
machine speed (in rev s^-1) as indicated in the following experimental run:

Speed               | 8  9   10  11  12  13  14  15
Defectives per hour | 7  12  13  13  13  16  14  18

Find the regression line for number of defectives against speed, and a 90% confidence
interval for the mean number of defectives per hour when the speed is 14 rev s^-1.

30 Sometimes it is required that the regression line passes through the origin, in which
case the only regression coefficient is the slope of the line. Use the least-squares
procedure to show that the estimate of the slope is then

$$\hat{b} = \Big(\sum_i X_i Y_i\Big) \Big/ \Big(\sum_i X_i^2\Big)$$

31 A series of measurements of voltage across and current through a resistor produced
the following results:

Voltage (V)  | 1  2   3   4   5   6   7   8   9   10  11  12
Current (mA) | 6  18  27  30  42  48  58  69  74  81  94  99

Estimate the resistance, using the result of the previous exercise.

32 The pressure P of a gas corresponding to various volumes V was recorded as follows:


50 60 70 90 100 
64.7 513 40.5 25.9 7.8 


V (cm?) 
P (kgcm?) 





The ideal gas law is given by the equation 
Р^=С 


where å and C are constants. By taking 
logarithms and using the least-squares method, 
estimate A and C from the data and predict P 
when V 2 80 cm*. 


The following data show the unit costs of producing 
certain electronic components and the number of 
units produced: 


50 100 250 500 1000 2000 5000 
108 65 21 13 4 22 1 


Lot size X; 
Unit cost Y; 





Fit a model of the form Y = aX” and predict the unit 
cost for a lot size of 300. 
11.6 Goodness-of-fit tests

The common classes of distributions, especially the binomial, Poisson and normal
distributions, which often govern the data in experimental contexts, are used as the basis
for statistical methods of estimation and testing. A question that naturally arises is whether
or not a given set of data actually follows an assumed distribution. If it does then the
statistical methods can be used with confidence. If not then some alternative should be
considered. The general procedure used for testing this can also be used to test for
dependence between two variables.

11.6.1 Chi-square distribution and test

No set of data will follow an assumed distribution exactly, but there is a general method
for testing a distribution as a statistical hypothesis. If the hypothesis is accepted then it
is reasonable to use the distribution as an approximation to reality.

First the data must be partitioned into classes. If the data consist of observations
from a discrete distribution then they will be in classes already, but it may be appropriate
to combine some classes if the numbers of observations are small. For each class the
number of observations that would be expected to occur under the assumed distribution
can be worked out. The following quantity acts as a test statistic for comparing the
observed and expected class numbers:

$$ \chi^2 = \sum_{k=1}^{m} \frac{(f_k - e_k)^2}{e_k} $$

where $f_k$ is the number of observations in the kth class, $e_k$ is the expected number in the
kth class and m is the number of classes. Clearly, $\chi^2$ is a non-negative quantity whose
magnitude indicates the extent of the discrepancy between the histogram of data and
the assumed distribution. For small samples the histogram is erratic and the comparison
invalid, but for large samples the histogram should approximate the true distribution. It
can be shown that for a large sample the random variable $\chi^2$ has a 'chi-square'
distribution. This class of distributions is widely used in statistics, and a typical chi-square
probability density function is shown in Figure 11.16. We are interested in particular in
the value of $\chi^2_{\alpha,n}$ to the right of which the area under the density function curve is $\alpha$,
where n is the (single) parameter of the distribution. These values are extensively
tabulated; a typical table is shown in Figure 11.17.

Figure 11.16 The chi-square distribution with $\chi^2_{\alpha,n}$.


Figure 11.17 Table of the chi-square distribution $\chi^2_{\alpha,n}$.

 n    α = 0.05   α = 0.025   α = 0.01   α = 0.005
 1     3.841      5.024       6.635      7.879
 2     5.991      7.378       9.210     10.597
 3     7.815      9.348      11.345     12.838
 4     9.488     11.143      13.277     14.860
 5    11.070     12.832      15.086     16.750
 6    12.592     14.449      16.812     18.548
 7    14.067     16.013      18.475     20.278
 8    15.507     17.535      20.090     21.955
 9    16.919     19.023      21.666     23.589
10    18.307     20.483      23.209     25.188
11    19.675     21.920      24.725     26.757
12    21.026     23.337      26.217     28.300
13    22.362     24.736      27.688     29.819
14    23.685     26.119      29.141     31.319
15    24.996     27.488      30.578     32.801
16    26.296     28.845      32.000     34.267
17    27.587     30.191      33.409     35.718
18    28.869     31.526      34.805     37.156
19    30.144     32.852      36.191     38.582
20    31.410     34.170      37.566     39.997
21    32.671     35.479      38.932     41.401
22    33.924     36.781      40.289     42.796
23    35.172     38.076      41.638     44.181
24    36.415     39.364      42.980     45.558
25    37.652     40.646      44.314     46.928
26    38.885     41.923      45.642     48.290
27    40.113     43.194      46.963     49.645
28    41.337     44.461      48.278     50.993
29    42.557     45.722      49.588     52.336
30    43.773     46.979      50.892     53.672


The hypothesis of the assumed distribution is rejected if

$$ \chi^2 > \chi^2_{\alpha,\,m-t-1} $$

where $\alpha$ is the significance level and t is the number of independent parameters
estimated from the data and used for computing the $e_k$ values. The significance level is the
probability of false rejection, as discussed in Section 11.3.4. The only difference is that
there is now no estimation procedure underlying the hypothesis test, which must stand
alone. Sometimes the hypothesis is deliberately vague, for example a parameter value
may be left unspecified. If the data themselves are used to fix parameter values in the
assumed distribution before testing then the test must be strengthened to allow for this
in the form of a correction t in the chi-square parameter.

A useful rule of thumb when using this test is that there should be at most a small
number (one or two) of classes with an expected number of observations less than five.
If necessary, classes in the tails of the distribution can be merged.
Example 11.20

A die is tossed 600 times, and the numbers of occurrences of the numbers one to six are
recorded respectively as 89, 113, 98, 104, 117 and 79. Is the die fair or biased?

Solution The expected values are $e_k = 100$ for all k, and the test value is $\chi^2 = 10.4$. This is
less than $\chi^2_{0.05,5} = 11.07$, so we should expect results as erratic as this at least once in 20
similar experiments. The die may well be biased, but the data are insufficient to justify
this conclusion.

Example 11.21

The numbers of trucks arriving per hour at a warehouse are counted for each of 500 h.
Counts of zero up to eight arrivals are recorded on respectively 52, 151, 130, 102, 45,
12, 5, 1 and 2 occasions. Test the hypothesis that the numbers of arrivals have a Poisson
distribution, and estimate how often there will be nine or more arrivals in one hour.

Solution The hypothesis stipulates a Poisson distribution, but without specifying the parameter
$\lambda$. Since the mean of the Poisson distribution is $\lambda$ and the average number of arrivals
per hour is 2.02 from the data, it is reasonable to assume that $\lambda = 2$. The columns in the
table in Figure 11.18 show the observed counts $f_k$, the Poisson probabilities $p_k$, the
expected counts $e_k = 500p_k$ and the individual $\chi^2$ values for each class. The last two
classes have been combined because the numbers are so small. One parameter has
been estimated from the data, so the total $\chi^2$ value is compared with $\chi^2_{0.05,6} = 12.59$. The
Poisson hypothesis is accepted, and on that basis the probability of nine or more trucks
arriving in one hour is

$$ P(9 \text{ or more}) = 1 - \sum_{k=0}^{8} \frac{2^k e^{-2}}{k!} = 0.000\,237 $$

This will occur roughly once in every 4200 h of operation.

Figure 11.18 Chi-square calculation for Example 11.21.





Trucks    $f_k$    $p_k$     $e_k$    $\chi^2$
0          52    0.1353     67.7    3.63
1         151    0.2707    135.3    1.81
2         130    0.2707    135.3    0.21
3         102    0.1804     90.2    1.54
4          45    0.0902     45.1    0.00
5          12    0.0361     18.0    2.02
6           5    0.0120      6.0    0.17
7 or more   3    0.0046      2.3    0.24

Totals    500    1.0       500      9.62
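Both goodness-of-fit calculations above can be reproduced numerically. The sketch below is illustrative only: the function and variable names are assumptions, and the two critical values are simply copied from the table in Figure 11.17 rather than computed.

```python
import math

# Upper 5% points of the chi-square distribution, copied from Figure 11.17.
CHI2_5_PERCENT = {5: 11.070, 6: 12.592}


def chi_square(observed, expected):
    """Sum of (f_k - e_k)^2 / e_k over all classes."""
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with parameter lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


# Example 11.20: fair die, no parameters estimated (t = 0).
die_counts = [89, 113, 98, 104, 117, 79]
die_chi2 = chi_square(die_counts, [sum(die_counts) / 6] * 6)
die_dof = len(die_counts) - 0 - 1

# Example 11.21: Poisson arrivals with lambda = 2; classes 0..6 and "7 or more".
truck_counts = [52, 151, 130, 102, 45, 12, 5, 1 + 2]
lam = 2.0
probs = [poisson_pmf(k, lam) for k in range(7)]
probs.append(1.0 - sum(probs))                  # probability of the tail class
truck_expected = [sum(truck_counts) * p for p in probs]
truck_chi2 = chi_square(truck_counts, truck_expected)
truck_dof = len(truck_counts) - 1 - 1           # one parameter estimated (t = 1)

p_nine_or_more = 1.0 - sum(poisson_pmf(k, lam) for k in range(9))

for label, stat, dof in [("die", die_chi2, die_dof), ("trucks", truck_chi2, truck_dof)]:
    decision = "rejected" if stat > CHI2_5_PERCENT[dof] else "accepted"
    print(f"{label}: chi-square = {stat:.2f} on {dof} degrees of freedom, {decision} at 5%")
print(f"P(9 or more trucks) = {p_nine_or_more:.6f}")
```

Note that the parameter of the chi-square distribution is m - t - 1 in each case: five for the die (six classes, nothing estimated) and six for the trucks (eight classes, one parameter estimated).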


Because so many statistical methods assume normal data, it is important to have a test 
for normality, and the chi-square method can be used (Exercise 38 and Section 11.8.4). 


11.6.2 Contingency tables

In Section 11.4.3 the correlation was introduced as a measure of dependence between
two random variables. The sample correlation (Section 11.4.4) provides an estimate
from data. This measure only applies to numerical random variables, and then only
works for linear dependence (Exercise 15). The rank correlation (Section 11.4.6) has
more general applicability, but still requires that the data be classified in order of rank.
The chi-square testing procedure can be adapted to provide at least an indicator of
dependence that has the widest applicability of all.

Suppose that each item in a sample of size n can be separately classified as one of
$A_1, \ldots, A_r$ by one criterion, and as one of $B_1, \ldots, B_c$ by another (these may be
numerical values, but not necessarily). The number $f_{ij}$ of items in the sample that are
classified as '$A_i$ and $B_j$' can be counted for each $i = 1, \ldots, r$ and $j = 1, \ldots, c$. The table
of these numbers (with r rows and c columns) is called a contingency table (Figure 11.19). The
question is whether the two criteria are independent. If not then some combinations of
$A_i$ and $B_j$ will occur significantly more often (and others less often) than would be
expected under the assumption of independence. We first have to work out how many
would be expected under an assumption of independence.

Figure 11.19 Contingency table.

Class | $B_1$     ...  $B_c$     | Total
$A_1$ | $f_{11}$  ...  $f_{1c}$  | $f_{1\cdot}$
...   | ...            ...       | ...
$A_r$ | $f_{r1}$  ...  $f_{rc}$  | $f_{r\cdot}$
Total | $f_{\cdot 1}$  ...  $f_{\cdot c}$ | n
Let the row and column totals be denoted by

$$ f_{i\cdot} = \sum_{j=1}^{c} f_{ij} \quad (i = 1, \ldots, r) $$

$$ f_{\cdot j} = \sum_{i=1}^{r} f_{ij} \quad (j = 1, \ldots, c) $$

If the criteria are independent then the joint probability for each combination can be
expressed as the product of the separate marginal probabilities:

$$ P(A_i \cap B_j) = P(A_i) P(B_j) $$

The chi-square procedure will be used to see how well the data fit this assumption. To
test it, we can estimate the marginal probabilities from the row and column totals,

$$ P(A_i) \approx \frac{f_{i\cdot}}{n}, \qquad P(B_j) \approx \frac{f_{\cdot j}}{n} $$

and multiply the product of these by n to obtain the expected number $e_{ij}$ for each
combination:

$$ e_{ij} = n \, \frac{f_{i\cdot}}{n} \, \frac{f_{\cdot j}}{n} = \frac{f_{i\cdot} f_{\cdot j}}{n} $$

The chi-square goodness-of-fit statistic follows from the actual and expected numbers
($f_{ij}$ and $e_{ij}$) as a sum over all the rows and columns:

$$ \chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(f_{ij} - e_{ij})^2}{e_{ij}} $$

If the value of this is large then the hypothesis of independence is rejected, because
the actual and expected counts differ by more than can be attributed to chance. As
explained in Section 11.6.1, the largeness is judged with respect to $\chi^2_{\alpha,\,m-t-1}$ from the
chi-square table. The number of classes, m, is the number of rows times the number of
columns, rc. The number of independent parameters estimated from the data, t, is the
number of independent marginal probabilities $P(A_i)$ and $P(B_j)$:

$$ t = (r - 1) + (c - 1) $$

The number is not r + c, because the row probabilities and the column probabilities must
each sum to one, so when all but one are specified, the last is determined. Finally,

$$ m - t - 1 = rc - (r + c - 2) - 1 = (r - 1)(c - 1) $$

The hypothesis of independence is therefore rejected (at significance level $\alpha$) if

$$ \chi^2 > \chi^2_{\alpha,\,(r-1)(c-1)} $$

Example 11.22

An accident inspector makes spot checks on working practices during visits to
industrial sites chosen at random. At one large construction site the numbers of accidents
occurring per week were counted for a period of three years, and each week was also
classified as to whether or not the inspector had visited the site during the previous
week. The results are shown in bold print in Figure 11.20. Do visits by the inspector
tend to reduce the number of accidents, at least in the short term?

Figure 11.20 Contingency table for Example 11.22.

                          Number of accidents
           0             1             2             3            Total
Visit      20 (13.38)    3 (7.08)      1 (2.46)      0 (1.08)     24
Residual   2.96          -1.99         -1.07         -1.16
No visit   67 (73.62)    43 (38.92)    15 (13.54)    7 (5.92)     132
Residual   -2.96         1.99          1.07          1.16
Total      87            46            16            7            156
Solution The respective row and column totals are shown in Figure 11.20, together with the
expected numbers $e_{ij}$ in parentheses in each cell. For example, the top left cell has
observed number 20, row total 24, column total 87, n = 156, and hence the expected
number

$$ e_{11} = (24)(87)/156 = 13.38 $$

The chi-square sum is

$$ \chi^2 = \frac{(20 - 13.38)^2}{13.38} + \cdots + \frac{(7 - 5.92)^2}{5.92} = 8.94 $$

With two rows, four columns and a significance level of 5%, the appropriate number
from the chi-square table is $\chi^2_{0.05,3} = 7.815$. The calculated value exceeds this, and by
comparing the observed and expected numbers in the table, it seems clear that the visits
by the inspector do tend to reduce the number of accidents. This is not, however,
significant at the 2.5% level.
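The expected numbers and the chi-square sum for a contingency table follow mechanically from the row and column totals. The following sketch, in which the function names are illustrative assumptions, reproduces the figures of Example 11.22:

```python
def expected_counts(table):
    """Expected numbers e_ij = (row total)(column total)/n under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return the chi-square sum and its parameter (r - 1)(c - 1)."""
    expected = expected_counts(table)
    stat = sum(
        (f - e) ** 2 / e
        for row_f, row_e in zip(table, expected)
        for f, e in zip(row_f, row_e)
    )
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Rows: visit, no visit; columns: 0, 1, 2, 3 accidents in the week.
accidents = [[20, 3, 1, 0], [67, 43, 15, 7]]
stat, dof = chi_square_independence(accidents)
e_11 = expected_counts(accidents)[0][0]
print(f"e_11 = {e_11:.2f}, chi-square = {stat:.2f} on {dof} degrees of freedom")
```

The result is then compared with the tabulated values 7.815 (5%) and 9.348 (2.5%) for three degrees of freedom, as in the solution.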


A significant chi-square value does not by itself reveal what part or parts of the table
are responsible for the lack of independence. A procedure that is often helpful in this
respect is to work out the adjusted residual for each cell, defined as

$$ d_{ij} = \frac{f_{ij} - e_{ij}}{\sqrt{e_{ij}\,(1 - f_{i\cdot}/n)(1 - f_{\cdot j}/n)}} $$

Under the assumption of independence, these are approximately standard normal, so a
significant value for a cell suggests that that cell is partly responsible for the
dependence overall. The adjusted residuals for the contingency table in Example 11.22 are
shown in Figure 11.20, and support the conclusion that visits by the inspector tend to
reduce the number of accidents.
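The adjusted residuals can be computed in the same way as the expected numbers. This is a sketch under the definition given above, with an assumed function name; small differences in the last decimal place from the values printed in Figure 11.20 may arise from rounding.

```python
import math


def adjusted_residuals(table):
    """Adjusted residual d_ij for every cell of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for row, r_tot in zip(table, row_totals):
        out_row = []
        for f, c_tot in zip(row, col_totals):
            e = r_tot * c_tot / n                       # expected number e_ij
            scale = math.sqrt(e * (1 - r_tot / n) * (1 - c_tot / n))
            out_row.append((f - e) / scale)
        residuals.append(out_row)
    return residuals


accidents = [[20, 3, 1, 0], [67, 43, 15, 7]]
residuals = adjusted_residuals(accidents)
for label, row in zip(("Visit", "No visit"), residuals):
    print(label, " ".join(f"{d:6.2f}" for d in row))
```

With only two rows the residuals in each column are equal and opposite, which is why the second residual row in Figure 11.20 mirrors the first.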

For a useful survey of procedures for analysing contingency tables see B. S. Everitt, 
The Analysis of Contingency Tables, 2nd edn, Chapman & Hall, London, 1992. 


11.6.3 Exercises

34 In a genetic experiment, outcome A is expected to occur twice as often as outcome B,
which in turn is expected to occur twice as often as outcome C, and exactly one of these
three outcomes must occur. In a sample of size 100, outcomes A, B, C occurred 63, 22, 15
times respectively. Test the hypothesis at 5% significance.

35 The number of books borrowed from a library that is open five days a week is as follows:
Monday 153, Tuesday 108, Wednesday 120, Thursday 114, Friday 145. Test (at 5% significance)
whether the number of books borrowed depends on the day of the week.

36 A new process for manufacturing light fibres is being tested. Out of 50 samples, 32
contained no flaws, 12 contained one flaw and 6 contained two flaws. Test the hypothesis
that the number of flaws per sample has a Poisson distribution.

37 In an early experiment on the emission of α-particles from a radioactive source, Rutherford
obtained the following data on counts of particles during constant time intervals:

Number of particles | 0   1    2    3    4    5    6    7    8   9   10  >10
Number of intervals | 57  203  383  525  532  408  273  139  45  27  10  6

Test the hypothesis that the number of particles emitted during an interval has a Poisson
distribution.

38 Two samples of 100 data have been grouped into classes as shown in Figure 11.21. The
sample average and standard deviation in each case were 10.0 and 2.0 respectively.

(a) Draw the histogram for each sample.
(b) Test each sample for normality using the measured parameters.

Class       Sample 1   Sample 2
<6.5           4          3
6.5-7.5        6          6
7.5-8.5       16         16
8.5-9.5       16         13
9.5-10.5      17         26
10.5-11.5     20          7
11.5-12.5     12         19
12.5-13.5      6          5
>13.5          3          5

Figure 11.21 Data classification for Exercise 38.

39 Shipments of electronic devices have been received by a firm from three sources: A, B
and C. Each device is classified as either perfect, intermediate (imperfect but acceptable),
or unacceptable. From source A 89 were perfect, 23 intermediate and 12 unacceptable.
Corresponding figures for source B were 62, 12 and 8 respectively, and for source C 119, 30
and 21 respectively. Is there any significant difference in quality between the devices
received from the three sources?

40 Cars produced at a factory are chosen at random for a thorough inspection. The number
inspected and the number of those that were found to be unsuitable for shipment were
counted monthly for one year as follows:

Month     | Jan.  Feb.  Mar.  Apr.  May   Jun.
Inspected | 450   550   550   400   600   450
Defective | 8     14    6     3     7     8

Month     | Jul.  Aug.  Sep.  Oct.  Nov.  Dec.
Inspected | 450   200   450   600   600   550
Defective | 16    5     12    6     15    9

Is there a significant variation in quality through the year?

41 Customers ordering regularly from an on-line clothing catalogue are classed as low, medium
and high spenders. Considering four products from the catalogue (a jacket, a shirt, a pair
of trousers, and a pair of shoes), the numbers of customers in each class buying these
products over a fixed period of time are given in the following table:

Spending level | Jacket | Shirt | Trousers | Shoes
Low            | 21     | 94    | 57       | 113
Medium         | 66     | 157   | 94       | 209
High           | 58     | 120   | 41       | 125

Does this table provide evidence that customers with different spending levels tend to
choose different products?

42 A quality control engineer takes daily samples of four television sets coming off an
assembly line. In a total of 200 working days he found that on 110 days no sets required
adjustments, on 73 days one set required adjustments, on 16 days two sets and on 1 day
three sets. Use these results to test the hypothesis that 10% of sets coming off the
assembly line required adjustments, at 5% and 1% significance levels. Also test this using
the confidence interval for proportion (Section 11.3.6), using the total number of sets
requiring adjustments.


11.7 Moment generating functions

This section is more difficult than the rest of this chapter, and can be treated as optional
during a first reading. It contains the proofs of theoretical results, such as the central
limit theorem, that are of great importance, as seen earlier in this chapter. The technique
introduced here is the moment generating function, which is a useful tool for finding
means and variances of random variables as well as for proving these essential results.
The moment generating function also bears a striking resemblance to the Laplace
transform considered in Chapter 5.

11.7.1 Definition and simple applications


The moment generating function of a random variable X is defined as

$$ M_X(t) = E(e^{tX}) = \begin{cases} \displaystyle\sum_{k=1}^{m} e^{t v_k} P(X = v_k) & \text{in the discrete case} \\[2ex] \displaystyle\int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx & \text{in the continuous case} \end{cases} $$

where t is a real variable. This is an example of the expected value of a function
of the random variable X (Section 11.2.2). The moment generating function does not
always exist, or it may exist only for certain values of t. In cases where it fails to exist
there is an alternative (called the characteristic function) which always exists and has
similar properties. When the moment generating function does exist, it is unique in the
sense that no two distinct distributions can have the same moment generating function.

To see how the moment generating function earns its name, the first step is to
differentiate it with respect to t and then let t tend to zero:

$$ \left. \frac{d}{dt} M_X(t) \right|_{t=0} = \left. E(X e^{tX}) \right|_{t=0} = E(X) = \mu_X $$

Thus the first derivative gives the mean. Differentiating again gives

$$ \left. \frac{d^2}{dt^2} M_X(t) \right|_{t=0} = \left. E(X^2 e^{tX}) \right|_{t=0} = E(X^2) $$

From this result we obtain the variance (from Section 11.2.2)

$$ \mathrm{Var}(X) = E(X^2) - [E(X)]^2 $$

We can summarize these results as follows. If a random variable X has mean $\mu_X$,
variance $\sigma_X^2$ and moment generating function $M_X(t)$ defined for t in some
neighbourhood of zero then

$$ \mu_X = M_X^{(1)}(0), \qquad \sigma_X^2 = M_X^{(2)}(0) - [M_X^{(1)}(0)]^2 $$

where the superscript in parentheses denotes the order of the derivative. Furthermore,

$$ E(X^k) = M_X^{(k)}(0) \quad (k = 1, 2, \ldots) $$
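Example 11.23

Show that the mean and variance of the Poisson distribution with parameter $\lambda$ are both
equal to $\lambda$ (Section 11.2.3).

Solution If X has a Poisson distribution then

$$ P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!} \quad (k = 0, 1, 2, \ldots) $$

and the moment generating function is

$$ M_X(t) = \sum_{k=0}^{\infty} e^{kt}\,\frac{\lambda^k e^{-\lambda}}{k!} = e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda e^{t})^k}{k!} = \exp[\lambda(e^{t} - 1)] $$

Differentiating this with respect to t gives

$$ M_X^{(1)}(t) = \lambda e^{t} \exp[\lambda(e^{t} - 1)] $$

$$ M_X^{(2)}(t) = (\lambda^2 e^{2t} + \lambda e^{t}) \exp[\lambda(e^{t} - 1)] $$

from which

$$ \mu_X = M_X^{(1)}(0) = \lambda $$

$$ \sigma_X^2 = M_X^{(2)}(0) - \mu_X^2 = \lambda $$

as expected.

Example 11.24

Show that the mean and standard deviation of the exponential distribution

$$ f_X(x) = \begin{cases} \lambda e^{-\lambda x} & (x \ge 0) \\ 0 & (x < 0) \end{cases} $$

are both equal to $1/\lambda$ (Section 11.10.2).

Solution The moment generating function is

$$ M_X(t) = \int_{0}^{\infty} e^{tx}\,\lambda e^{-\lambda x}\,dx = \frac{\lambda}{\lambda - t} $$

Note that the integral only exists for $t < \lambda$, but we can differentiate it and set t = 0
for any positive value of $\lambda$:

$$ M_X^{(1)}(0) = \lambda^{-1} = \mu_X $$

$$ M_X^{(2)}(0) = 2\lambda^{-2} $$

from which $\sigma_X^2 = 2\lambda^{-2} - \lambda^{-2} = \lambda^{-2}$, so that $\sigma_X = \lambda^{-1}$.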
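The relations between the derivatives of the moment generating function at zero and the mean and variance can be checked numerically. The sketch below approximates the derivatives by central differences; the function names, the step size h and the choice $\lambda = 3$ are illustrative assumptions, not part of the examples above.

```python
import math


def moments_from_mgf(mgf, h=1e-3):
    """Estimate (mean, variance) from central differences of an MGF at t = 0."""
    first = (mgf(h) - mgf(-h)) / (2 * h)                  # approximates M'(0)
    second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h ** 2   # approximates M''(0)
    return first, second - first ** 2


lam = 3.0  # illustrative parameter value


def poisson_mgf(t):
    """MGF of the Poisson distribution, from Example 11.23."""
    return math.exp(lam * (math.exp(t) - 1))


def exponential_mgf(t):
    """MGF of the exponential distribution, from Example 11.24 (t < lam)."""
    return lam / (lam - t)


poisson_mean, poisson_var = moments_from_mgf(poisson_mgf)
expo_mean, expo_var = moments_from_mgf(exponential_mgf)
print(f"Poisson: mean = {poisson_mean:.4f}, variance = {poisson_var:.4f}")
print(f"exponential: mean = {expo_mean:.4f}, standard deviation = {math.sqrt(expo_var):.4f}")
```

The finite-difference estimates agree with $\lambda$ for the Poisson mean and variance, and with $1/\lambda$ for the exponential mean and standard deviation, to within the truncation error of the approximation.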
11.7.2 The Poisson approximation to the binomial

In addition to its utility for finding means and variances, the moment generating
function is a very useful theoretical tool. In this section and the next we shall use it to prove
two of the most important results in probability theory, but first we need the following
general property of moment generating functions.

Suppose that X and Y are independent random variables with moment generating
functions $M_X(t)$ and $M_Y(t)$. Then the moment generating function of their sum is
given by

$$ M_{X+Y}(t) = M_X(t)\,M_Y(t) $$

To prove this, we shall assume that X and Y are continuous random variables with a
joint density function $f_{X,Y}(x, y)$; however, the proof is essentially the same if either or
both are discrete. By definition,

$$ M_{X+Y}(t) = E[e^{t(X+Y)}] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{t(x+y)} f_{X,Y}(x, y)\,dx\,dy $$

Both factors of the integrand themselves factorize (noting the independence of X and
Y) to give

$$ M_{X+Y}(t) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} [e^{tx} f_X(x)]\,[e^{ty} f_Y(y)]\,dx\,dy $$

The two integrals can now be separated, and the result follows:

$$ M_{X+Y}(t) = \int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx \int_{-\infty}^{\infty} e^{ty} f_Y(y)\,dy = M_X(t)\,M_Y(t) $$

It follows that if $\{X_1, \ldots, X_n\}$ are independent and identically distributed random
variables, each with moment generating function $M_X(t)$, and if $Z = X_1 + \ldots + X_n$,
then

$$ M_Z(t) = [M_X(t)]^n $$

We now proceed to the main result. If the random variable Y has a binomial
distribution with parameters n and p and the random variable X has a Poisson distribution
with parameter $\lambda = np$ then as $n \to \infty$ and $p \to 0$, the distribution of Y tends to that
of X.

First let the random variable B have a Bernoulli distribution with parameter p
(Section 11.2.3). The moment generating function of B is then

$$ M_B(t) = e^{t} p + (1 - p) = 1 + p(e^{t} - 1) $$

Since the binomial random variable Y is the sum of n copies of B, it follows that the
moment generating function of the binomial distribution is

$$ M_Y(t) = [1 + p(e^{t} - 1)]^n $$
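As a numerical illustration of the result just stated, the binomial and Poisson probabilities can be compared directly as n grows with $\lambda = np$ held fixed. This is a sketch only; the function names, the value $\lambda = 2$ and the range of k examined are illustrative assumptions.

```python
import math


def binomial_pmf(k, n, p):
    """P(Y = k) for a binomial distribution with parameters n and p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with parameter lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def max_pmf_gap(n, lam, k_max=10):
    """Largest |binomial - Poisson| probability over k = 0, ..., k_max, with p = lam/n."""
    p = lam / n
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(k_max + 1))


gaps = {n: max_pmf_gap(n, 2.0) for n in (10, 100, 1000, 10000)}
for n, gap in gaps.items():
    print(f"n = {n:5d}, p = {2.0 / n:.4f}, largest difference = {gap:.6f}")
```

The largest difference between the two sets of probabilities shrinks roughly in proportion to p, which is why the approximation is used when n is large and p is small.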
It can be proved that for any real z

$$ \lim_{n \to \infty} \left(1 + \frac{z}{n}\right)^n = e^{z} $$

Now let $z = np(e^{t} - 1)$ and assume that $\lambda = np$:

$$ M_Y(t) = \left(1 + \frac{z}{n}\right)^n \approx e^{z} = \exp[\lambda(e^{t} - 1)] = M_X(t) $$

(the moment generating function for the Poisson distribution was derived in Example
11.23). The uniqueness of the moment generating function implies that the distributions
of X and Y must be similar, provided that n is sufficiently large. It is also required that
$p \to 0$ as $n \to \infty$, so that z does not grow without bound.

This approximation has many applications; see for example Section 11.9.2.

11.7.3 Proof of the central limit theorem

This theorem is of vital importance in statistics because it tells us that the sample
average for a sample of at least moderate size tends to have a normal distribution even when
the data themselves do not. The result (Section 11.2.4) is here restated and proved using
the moment generating function.

Theorem 11.1 The central limit theorem

If $\{X_1, \ldots, X_n\}$ are independent and identically distributed random variables, each
with mean $\mu_X$ and variance $\sigma_X^2$, and if

$$ Z_n = \frac{X_1 + \ldots + X_n - n\mu_X}{\sigma_X \sqrt{n}} $$

then the distribution of $Z_n$ tends to the standard normal distribution as $n \to \infty$.

Proof First let the random variables $Y_i$ be defined by

$$ Y_i = \frac{X_i - \mu_X}{\sigma_X \sqrt{n}} \quad (i = 1, \ldots, n) $$

so that

$$ E(Y_i) = 0 $$

$$ E(Y_i^2) = \frac{1}{n} $$

$$ E(Y_i^3) = \frac{c}{n\sqrt{n}}, \quad \text{where } c \text{ is a constant} $$

Expanding the moment generating function $M_Y(t)$ for each $Y_i$ as a Maclaurin series (to
four terms) gives

$$ M_Y(t) \approx M_Y(0) + M_Y^{(1)}(0)\,t + M_Y^{(2)}(0)\,\frac{t^2}{2!} + M_Y^{(3)}(0)\,\frac{t^3}{3!} $$

By the results in Section 11.7.1, the coefficients of this series can be replaced by the
successive moments $E(Y_i^k)$:

$$ M_Y(t) = E(1) + E(Y_i)\,t + E(Y_i^2)\,\frac{t^2}{2} + E(Y_i^3)\,\frac{t^3}{6} = 1 + \frac{t^2}{2n} + \frac{c t^3}{6n\sqrt{n}} $$

Now $Z_n$ is the sum of the variables $Y_i$,

$$ Z_n = \sum_{i=1}^{n} Y_i $$

so the moment generating function of $Z_n$ is the nth power of that of $Y_i$:

$$ M_{Z_n}(t) = [M_Y(t)]^n $$

Retaining only the first two terms of a binomial expansion of this gives

$$ M_{Z_n}(t) = \left(1 + \frac{t^2}{2n}\right)^{n} + n \left(1 + \frac{t^2}{2n}\right)^{n-1} \frac{c t^3}{6n\sqrt{n}} + \ldots \to e^{t^2/2} \quad \text{as } n \to \infty $$

because all terms except the first will tend to zero.
It only remains to show that this is the moment generating function for the standard
normal distribution; see Exercise 47.
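The convergence of $M_{Z_n}(t)$ to $e^{t^2/2}$ can be observed for a particular distribution. The sketch below takes the $X_i$ to be exponential, an illustrative choice for which Example 11.24 gives $M_X(t) = \lambda/(\lambda - t)$ and $\mu_X = \sigma_X = 1/\lambda$; the moment generating function of each $Y_i$ is then $e^{-s}/(1 - s)$ with $s = t/\sqrt{n}$, whatever the value of $\lambda$. The function names are assumptions.

```python
import math


def mgf_standardized_sum(t, n):
    """MGF of Z_n for exponential X_i: [exp(-s) / (1 - s)]^n with s = t / sqrt(n)."""
    s = t / math.sqrt(n)
    if s >= 1:
        raise ValueError("the MGF exists only for t < sqrt(n)")
    # Work with logarithms so that the nth power does not lose accuracy.
    return math.exp(n * (-s - math.log1p(-s)))


def mgf_standard_normal(t):
    """Limiting MGF exp(t^2 / 2) claimed by the theorem."""
    return math.exp(t * t / 2)


t = 1.0
errors = {n: abs(mgf_standardized_sum(t, n) - mgf_standard_normal(t))
          for n in (10, 1000, 10 ** 6)}
for n, err in errors.items():
    print(f"n = {n:7d}: M_Zn(1) = {mgf_standardized_sum(t, n):.5f}, error = {err:.5f}")
```

The error decays like $1/\sqrt{n}$, in line with the $ct^3/(6n\sqrt{n})$ term that is discarded in the proof.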


11.7.4 Exercises

43 A continuous random variable X has density function

$$ f_X(x) = \begin{cases} c\,x\,e^{-2x} & (x > 0) \\ 0 & (x \le 0) \end{cases} $$

Find the value of the constant c. Derive the moment generating function, and hence find
the mean and variance of X.

44 Prove that if $X_1, \ldots, X_n$ are independent Poisson random variables with parameters
$\lambda_1, \ldots, \lambda_n$ then their sum

$$ Y = X_1 + \ldots + X_n $$

is also Poisson, with parameter $\lambda = \lambda_1 + \ldots + \lambda_n$.

45 A factory contains 30 machines each with breakdown probability 0.01 in any one hour and
40 machines each with breakdown probability 0.005 in any one hour. Use the result of
Exercise 44, together with the Poisson approximation to the binomial, to find the
probability that a total of three or more machine breakdowns will occur in any one hour.

46 A manufacturer has agreed to dispatch small servomechanisms in cartons of 100 to a
distributor. The distributor requires that 90% of cartons contain at most one defective
servomechanism. Assuming the Poisson approximation to the binomial, obtain an equation for
the Poisson parameter $\lambda$ such that the distributor's requirements are just satisfied.
Solve for $\lambda$ by one of the standard methods for nonlinear equations (approximate solution
0.5), and hence find the required proportion of manufactured servomechanisms that must be
satisfactory.

47 Show that the moment generating function of the standard normal distribution is $e^{t^2/2}$.
(Hint: Complete the square in the exponential.)
11.8 Engineering application: analysis of engine performance data

11.8.1 Introduction

Statistical methods are often used in conjunction with each other. So far in this chapter
the examples and exercises have almost always been designed to illustrate the various
topics one at a time. This section contains an example of a more extended problem to
which several topics are relevant, and correspondingly there are several stages to the
analysis.

The background to the problem is this. Suppose that the fuel consumption of a car
engine is tested by measuring the time that the engine runs at constant speed on a litre
of standard fuel. Two prototype engines, A and B, are being compared for fuel
consumption. For each engine a series of tests is performed in various ambient temperatures,
which are also recorded. Figure 11.22 contains the data. There are 30 observations each
for the four random variables concerned:

A, running time in minutes for engine A;
T, ambient temperature in degrees Celsius for engine A;
B, running time in minutes for engine B;
U, ambient temperature in degrees Celsius for engine B.

The histograms for the running times are compared in Figure 11.23(a) and those for the
temperatures in Figure 11.23(b). The overall profile of temperatures is very similar for
the two series of tests, differing only in the number of unusually high or low figures
encountered. The profiles of running times appear to differ rather more markedly. It is
clear that displaying the data in this way is useful, but some analysis will have to be
done in order to determine whether the differences are significant.

Figure 11.22 Data for engine case study.

       Engine A                  Engine B
 A      T     A      T      B      U     B      U
27.7   24    24.1    7     24.9   13    24.3   17
24.3   25    23.1   14     21.4   19    24.5   16
23.7   18    23.4   16     24.1   18    26.1   18
22.1   15    23.1    9     27.5   19    27.7   14
21.8   19    24.1   14     27.5   21    24.3   19
24.7   16    28.6   23     25.7   17    26.1    5
23.4   17    20.2   14     24.9   17    24.0   17
21.6   14    25.7   18     23.3   19    24.9   18
24.5   18    24.6   18     22.5   21    26.7   23
26.1   20    24.0   12     28.5   12    27.3   28
24.8   15    24.9   18     25.9   17    23.9   18
23.7   15    21.9   20     26.9   13    23.1   10
25.0   22    25.1   16     27.7   17    25.5   25
26.9   18    25.7   16     25.4   23    24.9   22
23.7   19    23.5   11     25.3   30    25.9   16





Figure 11.23 Histograms of engine data: (a) running times; (b) temperatures.

When planning the analysis, it is as well to consider the questions to which most
interest attaches. Do the mean running times for the two engines differ? Does the
running time depend on temperature? If so, and if there is a difference in the temperatures
for the test series on engines A and B, can this account for any apparent difference in
fuel consumption? More particularly, are the data normal? This has a bearing on the
methods used, and hence on the conclusions drawn.


Difference in mean running times and temperatures 


The sample averages and both versions of standard deviation for the data in Figure 11.22 
are as follows: 


422420,  T-21670,  B-2536 0=18.07 
S,=1.761,  $,24001,  $,21.6575, S,=4.932 
S,,4,7 1.791, $5,,74070, Spy = 1.685, $5,,— 5.017 


The average running time for engine B is slightly higher than for engine A. The sample 
standard deviations are very similar, so we can assume that the true standard deviations 
are equal and use the method for comparing means discussed in Section 11.3.5. The 
pooled estimate of the variance is 


\[
S_p^2 = \frac{(n-1)(S_{A,n-1}^2 + S_{B,n-1}^2)}{2(n-1)} = \frac{3.208 + 2.839}{2} = 3.023
\]

and the relevant value from the t-table is $t_{0.025,58} = 1.960$. In fact the sample is large
enough for the value for the normal distribution to be taken. The 95% confidence
interval for the difference $\mu_B - \mu_A$ is

\[
(\bar{B} - \bar{A} \pm 1.96\,S_p/\sqrt{15}) = (0.28, 2.04)
\]


Even the 98% confidence interval (0.12, 2.20) does not contain the zero point, so we
can be almost certain that the difference in mean running times is significant.
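
The 95% interval above can be checked numerically from the quoted summary values. The following is a minimal sketch, assuming equal sample sizes of n = 30 and the normal value 1.960; the function name is illustrative and not part of the text.

```python
import math

def pooled_mean_difference_ci(mean_a, mean_b, s_a, s_b, n, z=1.960):
    """Interval for mu_B - mu_A from two samples of equal size n.

    s_a and s_b are the (n - 1)-divisor sample standard deviations.
    """
    pooled_var = (s_a**2 + s_b**2) / 2               # equal sample sizes
    half_width = z * math.sqrt(pooled_var * 2 / n)   # S_p * sqrt(1/n + 1/n)
    diff = mean_b - mean_a
    return pooled_var, (diff - half_width, diff + half_width)

# Summary values for engines A and B as quoted in the text
pooled_var, (lower, upper) = pooled_mean_difference_ci(
    24.20, 25.36, 1.791, 1.685, n=30)
print(round(pooled_var, 3), round(lower, 2), round(upper, 2))
```

With n = 30 the factor sqrt(2/n) equals 1/sqrt(15), which is where the divisor in the interval comes from.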

Following the same procedure for the temperatures gives the 95% confidence interval
for the difference $\mu_U - \mu_T$ to be (-0.94, 3.68). Superficially, this is not significant,
and even if the running times do depend on temperature, the similarity of the two test 
series in this respect enables this factor to be discounted. If so, the fuel performance for 
engine B is superior to that for engine A. However, if the temperature sensitivity were 
very high then even a difference in the average too small to give a significant result by 
this method could create a misleading difference in the fuel consumption figures. This 
possibility needs to be examined. 


Dependence of running time on temperature 


The simplest way to test for dependence is to correlate times and temperatures for each 
engine. To compute the sample correlations, we need the following additional results 
from the data: 


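\[
\overline{AT} = 407.28, \quad \overline{BU} = 457.87
\]

The sample correlations (Section 11.4.4) of A and T and of B and U are then

\[
r_{AT} = \frac{\overline{AT} - \bar{A}\,\bar{T}}{S_A S_T} = 0.445, \quad
r_{BU} = \frac{\overline{BU} - \bar{B}\,\bar{U}}{S_B S_U} = -0.030
\]

and the respective 95% confidence intervals (Section 11.4.5) are

(0.10, 0.69), (-0.39, 0.34)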


This is a quite definitive result: the running time for engine A depends positively upon 
the ambient temperature, but that for engine B does not. The confidence intervals are 
based on the assumption that all the variables A, T, B and U are normal. The histograms
have this character, and a test for normality will be covered later. 
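
The correlation for engine A and its confidence interval can be reproduced from the summary values. This sketch assumes that the interval of Section 11.4.5 is the usual Fisher z-transform interval; that assumption gives the quoted (0.10, 0.69), but the function names are illustrative only.

```python
import math

def correlation_from_summaries(mean_xy, mean_x, mean_y, s_x, s_y):
    """Sample correlation from averages and n-divisor standard deviations."""
    return (mean_xy - mean_x * mean_y) / (s_x * s_y)

def fisher_interval(r, n, z=1.960):
    """Approximate interval for a correlation via the Fisher z-transform."""
    centre = math.atanh(r)
    half_width = z / math.sqrt(n - 3)
    return math.tanh(centre - half_width), math.tanh(centre + half_width)

# Engine A: running time A against temperature T, n = 30
r_at = correlation_from_summaries(407.28, 24.20, 16.70, 1.761, 4.001)
low, high = fisher_interval(r_at, n=30)
print(round(r_at, 3), round(low, 2), round(high, 2))
```

Because the inputs are rounded summary values, the computed correlation agrees with the quoted figure only to about two decimal places.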

Linear regression also reveals the dependence of running time on temperature. Here 
we assume that the variables are related by a linear model as follows: 


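\[
A_i = c_A + d_A T_i + E_i
\]
\[
B_i = c_B + d_B U_i + F_i
\]

where $c_A$, $d_A$, $c_B$ and $d_B$ are constants, and the random variables $E_i$ and $F_i$ represent
residual errors. Using the results of the least-squares analysis in Section 11.5.1, for
engine A we have

\[
\hat{d}_A = \frac{\overline{AT} - \bar{A}\,\bar{T}}{S_T^2} = 0.196, \quad
\hat{c}_A = \bar{A} - \hat{d}_A \bar{T} = 20.9
\]

Likewise, for engine B

\[
\hat{d}_B = -0.010, \quad \hat{c}_B = 25.5
\]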


Figure 11.24 Regression of running time against temperature: (a) engine A; (b) engine B
(vertical axis: time/min).


Figure 11.24 contains scatter plots of the data with these regression lines drawn. The 
points are well scattered about the lines. The residual variances, using the results in 
Sections 11.5.2 and 11.5.3, are 


\[
S_E^2 = S_A^2 - \hat{d}_A^2 S_T^2 = S_A^2(1 - r_{AT}^2) = 2.49
\]
\[
S_F^2 = 2.74
\]

As explained in Section 11.5.3, the respective values of $r^2$ indicate the extent to which the
variation in running times is due to the dependence on temperature. For engine B there
is virtually no such dependence. For engine A we have $r_{AT}^2 = 0.198$, so nearly 20% of
the variation in running times is accounted for in this way.

If we assume that the residuals $E_i$ and $F_i$ are normal, we can obtain confidence
intervals for the regression slopes. The appropriate value from the t table is
$t_{\alpha/2,n-2} = t_{0.025,28} = 2.048$, so the 95% confidence interval for $d_A$ is

\[
\left(\hat{d}_A \pm 2.048\,\frac{S_E}{S_T\sqrt{n-2}}\right) = (0.04, 0.35)
\]

The significance shown here confirms that found for the correlation. The 95% confidence
interval for $d_B$ is (-0.14, 0.12).
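
The regression figures for engine A follow directly from the summary values already quoted. The sketch below is a check under the assumption that the slope interval has half-width t S_E / (S_T sqrt(n - 2)), which reproduces the stated (0.04, 0.35); the variable names are illustrative.

```python
import math

n = 30
mean_a, mean_t, mean_at = 24.20, 16.70, 407.28   # summary values from the text
s_a, s_t = 1.761, 4.001                          # n-divisor standard deviations

cov_at = mean_at - mean_a * mean_t               # sample covariance of A and T
slope = cov_at / s_t**2                          # estimate of d_A
intercept = mean_a - slope * mean_t              # estimate of c_A
r_squared = (cov_at / (s_a * s_t))**2
resid_var = s_a**2 * (1 - r_squared)             # residual variance S_E^2

t_crit = 2.048                                   # t_{0.025,28} as quoted
half_width = t_crit * math.sqrt(resid_var) / (s_t * math.sqrt(n - 2))
slope_ci = (slope - half_width, slope + half_width)
print(round(slope, 3), round(intercept, 1), round(resid_var, 2))
print(tuple(round(v, 2) for v in slope_ci))
```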

We now return to the main question. We know that the average running time for 
engine A was significantly lower than that for engine B. However, we also know that 
the running time for engine A depends on temperature, and that the average temperature 
during the test series for engine A was somewhat lower than for engine B. Could this 
account for the difference? 

A simple way to test this is to look at the average values. The difference in average 
temperatures is (18.07 - 16.70) = 1.37 and the regression slope $d_A$ is estimated to be
0.196, so that a deficit in the average running time of (1.37)(0.196) = 0.27 min is
predicted on this basis. This cannot account for the actual difference of 1.16 min, and
does not even bring the zero point within the 95% confidence interval for the difference
in means $\mu_B - \mu_A$.

To examine this more carefully, we can try using the regression slopes to correct all
running times to the same temperature. Choosing the average temperature for the series
B for this purpose, we can let

\[
X_i = A_i + \hat{d}_A(\bar{U} - T_i)
\]
\[
Y_i = B_i + \hat{d}_B(\bar{U} - U_i)
\]

The 95% confidence interval for the difference between mean running times, $\mu_Y - \mu_X$,
at this temperature is then (0.06, 1.72), using the method in Section 11.3.5. The problem
with this is that the estimates of $d_A$ and $d_B$ themselves have rather wide confidence
intervals, and it is not satisfactory to adopt a point value to apply the temperature
correction.

We need a more direct way to obtain the confidence interval for the difference
between mean running times at any particular temperature. This is possible using a
more general theory of linear regression, formulated using matrix algebra, which allows
for any number of regression coefficients instead of just two; see for example G. A. F.
Seber, Linear Regression Analysis (Wiley, New York, 1977). Applied to the present
problem, estimates of the regression slopes $d_A$ and $d_B$ and intercepts $c_A$ and $c_B$ are obtained
simultaneously, with the same values as before, and it now becomes possible to obtain
a confidence interval for any linear combination of these four unknowns. In particular,
the confidence interval for the difference between mean running times at any temperature
t is based on the linear combination

\[
(c_B + d_B t) - (c_A + d_A t)
\]

Space precludes coverage of the analysis here, but the results can be seen in Figure 11.25.
At any temperature below 22.4°C engine B is predicted to have the advantage over
engine A (this temperature being the point at which the regression lines cross). At any
temperature below 18.1°C the 95% confidence interval for the difference is entirely
positive, and we should say that engine B has a significant advantage. This is the best
comparison of the engines that is possible using the data presented.

Figure 11.25 Predicted difference in mean running times (difference/min against
temperature/°C).


11.8.4 Test for normality


The confidence interval statistics are all based on the assumption of normality of the 
data. Although the sample sizes are reasonably large, so that the central limit theorem 
can be relied upon to weaken this requirement, it is worth applying a test for normality 
to see whether there is any clear evidence to the contrary. Here the regression residuals
$E_i$ and $F_i$ will be tested using the method described in Section 11.6.1.

Figure 11.26 shows the histogram of all 60 ‘standardized’ residuals. The residuals 
have zero mean in any case, and are standardized by dividing by the standard deviation 
so that they can be compared with a standard normal distribution. It is convenient to use 
intervals of width 0.4, and the comparison is developed in Figure 11.27. 


Figure 11.26 Histogram of residuals.


Figure 11.27 Table of the test for normality.

Interval        Observed (f_k)   Probability   Expected (e_k)   Chi-square
(-∞, -1.4)            5            0.0808          4.848           0.005
(-1.4, -1.0)          4            0.0779          4.674           0.097
(-1.0, -0.6)          6            0.1156          6.936           0.126
(-0.6, -0.2)         10            0.1464          8.784           0.168
(-0.2, +0.2)          8            0.1586          9.516           0.242
(+0.2, +0.6)         11            0.1464          8.784           0.559
(+0.6, +1.0)          5            0.1156          6.936           0.540
(+1.0, +1.4)          8            0.0779          4.674           2.367
(+1.4, +∞)            3            0.0808          4.848           0.704
Totals               60            1.0            60               4.809


The normal probabilities for each interval are obtained from the standard normal table
of the cumulative distribution function $\Phi(z)$, Figure 11.2, taking successive differences:

\[
P(z_1 < Z < z_2) = \Phi(z_2) - \Phi(z_1)
\]

These probabilities are multiplied by 60 to obtain the expected number in each interval,
and the difference between the observed and expected number for each interval is
squared and then divided by the expected number to give the contribution to the total
chi-square:

\[
\chi^2 = \sum_{k=1}^{m} \frac{(f_k - e_k)^2}{e_k} = 4.81
\]

This is small compared with $\chi^2_{0.05,8} = 15.507$, so the hypothesis of normality is accepted.


It is unwise when applying this test in general to have many classes with expected 
numbers less than five, so the intervals in the tails of the histogram have been merged. 
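
The chi-square total in Figure 11.27 can be recomputed from the observed class counts. The following is a sketch that evaluates the normal probabilities with the error function rather than reading them from a table, so the total agrees with the tabulated value only to about two decimal places; the helper name `phi` is an assumption of this example.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Class boundaries (tails merged) and observed counts from Figure 11.27
edges = [-math.inf, -1.4, -1.0, -0.6, -0.2, 0.2, 0.6, 1.0, 1.4, math.inf]
observed = [5, 4, 6, 10, 8, 11, 5, 8, 3]
total = sum(observed)

chi_square = 0.0
for lower, upper, f_k in zip(edges, edges[1:], observed):
    e_k = total * (phi(upper) - phi(lower))   # expected count in the class
    chi_square += (f_k - e_k)**2 / e_k

print(total, round(chi_square, 2))
```

The result is then compared with the tabulated critical value 15.507 quoted in the text.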


11.8.5 Conclusions 


All the questions posed in Section 11.8.1 have now been answered. Engine B has an
average running time that is significantly higher than that for engine A, showing that it
has the advantage in fuel consumption. However, this statement requires qualification.
The running time for engine A depends upon ambient temperature. The temperature
difference between the two test series was not significant, and does not account for the 
difference in average running times. However, engine B will only maintain its fuel 
advantage up to a certain point. This point has been estimated, but cannot be identified 
very precisely because of considerable residual scatter in the data. There are many 
potential sources of this scatter, such as errors in measuring out the fuel, or variations 
in the quantity and consistency of the engine oil. The scatter has a normal distribution, 
which justifies the statistics behind the conclusions reached. 


11.9 Engineering application: statistical quality control


11.9.1 Introduction 


Every manufacturer recognizes the importance of quality, and every manufacturing pro- 
cess involves some variation in the quality of its output, however that is to be measured. 
Experience shows that tolerating a lack of quality tends to be more costly in the 
end than promoting a quality approach. It follows that quality control is a major and 
increasing concern, and methods of statistical quality control are more important than 
ever. The domain of these methods now extends to the construction and service indus- 
tries as well as to manufacturing — wherever there is a process that can be monitored in 
quantitative terms. 

Traditionally, quality control involved the accumulation of batches of manufactured 
items, the testing of samples extracted from these batches, and the acceptance or rejec- 
tion (with appropriate rectifying action) of these batches depending upon the outcome. 
The essential problem with this is that it is too late within the process: it is impossible 
to inspect or test quality into a product. More recently the main concern has been to 
design the quality into the product or service and to monitor the process to ensure that 
the standard is maintained, in order to prevent any deficiency. Assurance can then be 
formally given to the customer that proper procedures are in place. 

Control charts play an important role in the implementation of quality. The idea of 
these is introduced in Section 13.6 in Modern Engineering Mathematics, where Shewhart 
charts for counts of defectives are described. In order for this section to be as self- 
contained as possible, some of that material is repeated here. This section then covers 
more powerful control charts and extends the scope of what they monitor. 

First note that there are two main alternative measures of quality: attribute and 
variable. In attribute measure, regular samples from the process are inspected and for 
each sample the number that fail according to some criterion is plotted on a chart. In 
variable measure, regular samples are again taken, but this time the sample average for 
some numerical measure (such as dimension or lifetime) is plotted. 


11.9.2 Shewhart attribute control charts 


Figure 11.28 is an example of a Shewhart control chart: a plot of the count of 'defectives'
(the number in the sample failing according to some chosen criterion) against sample
number. It is assumed that a small (specified) proportion of 'defective' items in the
process is permitted. Also shown on the chart are two limits on the count of defectives,
corresponding to probabilities of one in 40 and one in 1000 of a sample count falling
outside if the process is 'in control'; that is, conforming to the specification. These are
called warning and action limits respectively, and are denoted by $c_W$ and $c_A$.

Figure 11.28 Attribute control chart for Example 11.25 (count of defectives against
sample number, with warning and action limits).

Any sample point falling outside the action limit would normally result in the process
being suspended and the problem corrected. Roughly one in 40 sample points will
fall outside the warning limit purely by chance, but if this occurs repeatedly or if there
is a clear trend upwards in the counts of defectives then action may well be taken before
the action limit itself is crossed.

To obtain the warning and action limits, we use the Poisson approximation to the
binomial. If the acceptable proportion of defective items is p, usually small, and the
sample size is n then for a process in control the defective count C, say, will be a
binomial random variable with parameters n and p. Provided that n is not too small, the
Poisson approximation can be used (Section 11.7.2):

\[
P(C \ge c) \approx \sum_{k=c}^{\infty} \frac{(np)^k e^{-np}}{k!}
\]

Equating this to $\tfrac{1}{40}$ and then to $\tfrac{1}{1000}$ gives equations that can be solved for the warning
limit $c_W$ and the action limit $c_A$ respectively, in terms of the product np. This is the basis
for the table shown in Figure 11.29, which enables $c_W$ and $c_A$ to be read directly from
the value of np.


Example 11.25

Regular samples of 50 are taken from a process making electronic components, for
which an acceptable proportion of defectives is 5%. Successive counts of defectives in
each sample are as follows:

Sample  ...   7   8   9  10  11  12  13  14  15  16  17  18  19  20
Count   ...   4   4   2   6   7   4   5   5   8   6   5   9   7   8

At what point would the decision be taken to stop and correct the process?


Solution

The control chart is shown in Figure 11.28. From np = 2.5 and Figure 11.29 we have the
warning limit $c_W$ = 5.5 and the action limit $c_A$ = 8.5. The half-integer values are to avoid
ambiguity when the count lies on a limit. There are warnings at samples 6, 10, 11, 15 and
16 before the action limit is crossed at sample 18. Strictly, the decision should be taken
at that point, but the probability of two consecutive warnings is less than one in 1600
by the product rule of probabilities, which would justify taking action after sample 11.


Figure 11.29 Shewhart attribute control limits: n is sample size, p is probability of
defect, c_W is warning limit and c_A is action limit.

c_W or c_A   np for c_W      np for c_A
 1.5         <0.44           <0.13
 2.5         0.44-0.87       0.13-0.32
 3.5         0.87-1.38       0.32-0.60
 4.5         1.38-1.94       0.60-0.94
 5.5         1.94-2.53       0.94-1.33
 6.5         2.53-3.16       1.33-1.77
 7.5         3.16-3.81       1.77-2.23
 8.5         3.81-4.48       2.23-2.73
 9.5         4.48-5.17       2.73-3.25
10.5         5.17-5.87       3.25-3.79
11.5         5.87-6.59       3.79-4.35
12.5         6.59-7.31       4.35-4.93
13.5         7.31-8.05       4.93-5.52
14.5         8.05-8.80       5.52-6.12
15.5         8.80-9.55       6.12-6.74
16.5         9.55-10.31      6.74-7.37
17.5         10.31-11.08     7.37-8.01
18.5         11.08-11.85     8.01-8.66
19.5         11.85-12.63     8.66-9.31
20.5         12.63-13.42     9.31-9.98
21.5         13.42-14.21     9.98-10.65
22.5         14.21-15.00     10.65-11.33
23.5         15.00-15.80     11.33-12.02
24.5         15.80-16.61     12.02-12.71
25.5         16.61-17.41     12.71-13.41
26.5         17.41-18.23     13.41-14.11
27.5         18.23-19.04     14.11-14.82
28.5         19.04-19.86     14.82-15.53
29.5         19.86-20.68     15.53-16.25
30.5                         16.25-16.98
31.5                         16.98-17.70
32.5                         17.70-18.44
33.5                         18.44-19.17
34.5                         19.17-19.91
35.5                         19.91-20.66
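
As a rough check on the limits used in Example 11.25, the Poisson probability of a count falling beyond each limit can be evaluated directly. This is a sketch only: because counts are whole numbers and the limits are half-integers, the tail probabilities can only approximate the nominal one in 40 and one in 1000. The function name is illustrative.

```python
import math

def poisson_tail(limit, mean):
    """P(C > limit) for a Poisson count C with the given mean."""
    k_max = math.floor(limit)   # largest count that does not exceed the limit
    below = sum(mean**k * math.exp(-mean) / math.factorial(k)
                for k in range(k_max + 1))
    return 1.0 - below

np_value = 50 * 0.05                      # n = 50, p = 0.05 as in Example 11.25
p_warning = poisson_tail(5.5, np_value)   # chance of exceeding the warning limit
p_action = poisson_tail(8.5, np_value)    # chance of exceeding the action limit
print(round(p_warning, 4), round(p_action, 5))
```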


An alternative practice (especially popular in the USA) is to dispense with the warning
limit and to set the action limit (called the upper control limit, UCL) at three
standard deviations above the mean. Because the count of defectives is binomial with
mean np and variance np(1 - p), this means that

\[
\mathrm{UCL} = np + 3\sqrt{np(1 - p)}
\]


Example 11.26

Find the UCL and apply it to the data in Example 11.25.


Solution

From n = 50 and p = 0.05 we infer that UCL = 7.1, which is between the warning limit
$c_W$ and the action limit $c_A$ in Example 11.25. The decision to correct the process would
be taken after the 15th sample, the first to exceed the UCL.
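
The UCL calculation is short enough to verify directly. The sketch below uses only the counts for samples 7 to 20 of Example 11.25, as read from the data listing; the dictionary layout is an assumption of this example.

```python
import math

n, p = 50, 0.05
ucl = n * p + 3 * math.sqrt(n * p * (1 - p))   # three-sigma upper control limit

# Counts of defectives for samples 7 to 20 (earlier samples are not used here)
counts = {7: 4, 8: 4, 9: 2, 10: 6, 11: 7, 12: 4, 13: 5, 14: 5,
          15: 8, 16: 6, 17: 5, 18: 9, 19: 7, 20: 8}

# First listed sample whose count exceeds the UCL
first_over = next((m for m, c in counts.items() if c > ucl), None)
print(round(ucl, 1), first_over)
```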


11.9.3 Shewhart variable control charts 


Suppose now that the appropriate assessment of quality involves measurement on a
continuous scale rather than success or failure under some criterion. This arises whenever
some dimension of the output is critical for applications. Again we take samples, but
this time we measure this critical dimension and average the results. The Shewhart chart
for this variable measure is a plot of successive sample averages against sample number.
The warning and action limits $c_W$ and $c_A$ on a Shewhart chart are those points for
which the probabilities of a false alarm (where the result exceeds the limit even though
the process is in control) are one in 40 and one in 1000 respectively. For variable measure
the critical quantity can be either too high or too low, so the sample average must be tested
in each direction with the stated probability of exceedance for each limit. It follows that
the limits are determined by

\[
P(\bar{X} > \mu_X + c_W) = P(\bar{X} < \mu_X - c_W) = \tfrac{1}{40}
\]
\[
P(\bar{X} > \mu_X + c_A) = P(\bar{X} < \mu_X - c_A) = \tfrac{1}{1000}
\]

where $\bar{X}$ is the sample average and $\mu_X$ the design mean.
Provided that the sample size n is not too small, the central limit theorem allows the
sample average to be assumed normal (Section 11.3.2),

\[
\bar{X} \sim N(\mu_X, \sigma_X^2/n)
\]

and the normal distribution table (Figure 11.2) then gives

\[
c_W = \frac{1.96\,\sigma_X}{\sqrt{n}}, \quad c_A = \frac{3.09\,\sigma_X}{\sqrt{n}}
\]


Example 11.27

Measurements of sulphur dioxide concentration (in μg m⁻³) in the air are taken daily at
five locations, and successive average readings are as follows:

64.2, 56.9, 57.7, 67.9, 61.7, 59.7, 55.6, 63.7
58.3, 66.4, 67.2, 65.2, 63.1, 67.6, 64.1, 66.7

It is suspected that the mean increased during that time. Assuming normal data with a
long-term mean of 60.0 and standard deviation of 8.0, investigate whether an increase
occurred.


Solution

From n = 5 and $\sigma_X$ = 8 we have $c_W$ = 7.0 and $c_A$ = 11.1 (Figure 11.30). The warning limit
is 67.0, which is exceeded by sample numbers 4, 11 and 14. The action limit is 71.1,
which is not exceeded. The readings are suspiciously high, but not sufficiently so for
the conclusion to be justified.
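
The limits and the flagged samples in Example 11.27 can be confirmed with a few lines of code. This is a minimal sketch that tests only the upper limits, since the question concerns an increase; the variable names are illustrative.

```python
import math

mu, sigma, n = 60.0, 8.0, 5
c_w = 1.960 * sigma / math.sqrt(n)   # warning limit offset
c_a = 3.090 * sigma / math.sqrt(n)   # action limit offset

averages = [64.2, 56.9, 57.7, 67.9, 61.7, 59.7, 55.6, 63.7,
            58.3, 66.4, 67.2, 65.2, 63.1, 67.6, 64.1, 66.7]

# Sample numbers (starting at 1) whose average exceeds each upper limit
warnings = [m for m, x in enumerate(averages, start=1) if x > mu + c_w]
actions = [m for m, x in enumerate(averages, start=1) if x > mu + c_a]
print(round(c_w, 1), round(c_a, 1), warnings, actions)
```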


Figure 11.30 Variable control chart for Example 11.27 (concentration/μg m⁻³ against
sample number, with upper and lower warning and action limits).


As discussed in Section 11.9.2, the practice in the USA is somewhat different: there
are no warning limits, only action limits at three standard deviations on either side of
the mean. For a variable chart this allows a deviation from the mean of at most $3\sigma_X/\sqrt{n}$,
which is very close to the action limit usually used in the UK.


11.9.4 Cusum control charts


The main concern in designing a control chart is to achieve the best compromise between 
speedy detection of a fault on the one hand and avoidance of a proliferation of false alarms 
on the other. If the chart is too sensitive, it will lead to a large number of unnecessary 
shutdowns. The Shewhart charts, on the other hand, are rather conservative in that they 
are slow to indicate a slight but genuine shift in performance away from the design level. 
This derives from the fact that each sample point is judged independently and may well 
lie inside the action limits, whereas the cumulative evidence over several samples might 
justify an earlier decision. Rather informal methods involving repeated warnings and trends 
are used, but it is preferable to employ a more powerful control chart. The cumulative 
sum (cusum) chart achieves this, and is easily implemented on a small computer. 

Suppose that we have a sequence {$Y_1, Y_2, \ldots$} of observations, which may be either
counts of defectives or sample averages. From this a new sequence {$S_0, S_1, \ldots$}
is obtained by setting

\[
S_0 = 0, \quad S_m = \max(0, S_{m-1} + Y_m - r) \quad (m = 1, 2, \ldots)
\]

where r is a constant 'reference value'. This gives a cumulative sum of values of $Y_m - r$,
which is reset to zero whenever it goes negative. The out-of-control decision is made when

\[
S_m > h
\]

where h is a constant 'decision interval'. This will detect an increasing mean; a separate
but similar procedure can be used to detect a decreasing mean. Values of r and
h for both attribute and variable types of control can be obtained from tables such as
those in J. Murdoch, Control Charts (Macmillan, London, 1979), from which the attribute
table in Figure 11.31 has been extracted. For variable measure (with process design
mean $\mu_X$ and standard deviation $\sigma_X$) the following are often used:

\[
r = \mu_X + \frac{\sigma_X}{2\sqrt{n}}, \quad h = 5\,\frac{\sigma_X}{\sqrt{n}}
\]


Figure 11.31 Cusum attribute chart control data.

np      r     h          np      r     h
0.22    1     1.5        2.35    4     4.5
0.39    1     2.5        2.60    4     5.5
0.51    2     1.5        2.95    5     4.5
0.62    1     4.5        3.24    5     5.5
0.69    1     5.5        3.89    6     5.5
0.79    2     2.5        4.16    6     6.5
0.86    3     1.5        5.32    7     8.5
1.05    2     3.5        6.07    8     8.5
1.21    3     2.5        7.04    9     9.5
1.52    3     3.5        8.01   10    10.5
1.96    3     5.5        9.00   11    11.5
2.16    5     2.5       10.00   12    12.5


Example 11.28

Regular samples of 50 are taken from a process making electronic components, for which
an acceptable proportion of defectives is 5%. Successive counts of defectives in each
sample are as follows:

Sample  ...   9  10  11  12  13  14  15  16  17  18  19  20
Count   ...   2   6   7   4   5   5   8   6   5   9   7   8

At what point would the decision be taken to stop and correct the process?


Solution

The acceptable proportion of defectives is p = 0.05 and the regular sample size is
n = 50. From the table in Figure 11.31, with np = 2.5 the nearest figures for reference
value and decision interval are r = 4 and h = 5.5. The following shows the cusum $S_m$
below each count of defectives $Y_m$, and the cusum is also plotted in Figure 11.32:

Sample  ...   9  10  11  12  13  14  15  16  17  18  19  20
Count   ...   2   6   7   4   5   5   8   6   5   9   7   8
Cusum   ...   0   2   5   5   6   7  11  13  14  19  22  26

For example,

\[
S_{13} = S_{12} + Y_{13} - r = 5 + 5 - 4 = 6
\]

and because this exceeds h = 5.5, the decision to take action would be made after the
13th sample. This result can be compared with that of a Shewhart chart applied to the
same data (Example 11.25), which suggests that action should be taken after 18 samples.
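
The cusum recursion is straightforward to program. The following sketch applies it to the counts for samples 9 to 20 of Example 11.28, under the assumption that the cusum is zero on entering sample 9 (which is consistent with the value 5 used for sample 12 in the worked calculation); the function name is illustrative.

```python
def cusum(observations, reference, start=0.0):
    """Return the one-sided cusum values for a sequence of observations."""
    sums, s = [], start
    for y in observations:
        s = max(0.0, s + y - reference)   # reset to zero when negative
        sums.append(s)
    return sums

counts = [2, 6, 7, 4, 5, 5, 8, 6, 5, 9, 7, 8]   # samples 9 to 20
first_sample = 9
r, h = 4, 5.5                                   # from Figure 11.31 with np = 2.5

sums = cusum(counts, r)
alarm = next((first_sample + i for i, s in enumerate(sums) if s > h), None)
print(sums, alarm)
```

The same function serves for variable measure if sample averages are supplied in place of counts.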


Figure 11.32 Cusum control chart for Example 11.28 ($S_m$ against sample number m).


Example 11.29

Construct a cusum chart for the sulphur dioxide data in Example 11.27.


Solution

From $\mu_X$ = 60, $\sigma_X$ = 8 and n = 5 we have

\[
r = 60 + \frac{8}{2\sqrt{5}} = 61.8, \quad h = 5\,\frac{8}{\sqrt{5}} = 17.9
\]

The following table shows the sample average $\bar{X}_m$ and cusum $S_m$ for 1 ≤ m ≤ 16, and
the cusum is also plotted in Figure 11.33:

Average   64.2   56.9   57.7   67.9   61.7   59.7   55.6   63.7
Cusum      2.4    0      0      6.1    6.0    3.9    0      1.9
Average   58.3   66.4   67.2   65.2   63.1   67.6   64.1   66.7
Cusum      0      4.6   10.0   13.4   14.7   20.5   22.8   27.7

Figure 11.33 Cusum control chart for Example 11.29 ($S_m$ against sample number m).

Because $S_{14}$ = 20.5 exceeds h = 17.9, this chart suggests that the SO₂ concentration did
increase during the experiment, a stronger result than that obtained from the Shewhart
chart in Example 11.27.
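
For variable measure the reference value and decision interval come from the formulas rather than from a table. A minimal sketch for the sulphur dioxide data follows; it keeps r and h unrounded, so intermediate cusum values may differ from the tabulated ones in the last decimal place, and the variable names are illustrative.

```python
import math

mu, sigma, n = 60.0, 8.0, 5
r = mu + sigma / (2 * math.sqrt(n))   # reference value
h = 5 * sigma / math.sqrt(n)          # decision interval

averages = [64.2, 56.9, 57.7, 67.9, 61.7, 59.7, 55.6, 63.7,
            58.3, 66.4, 67.2, 65.2, 63.1, 67.6, 64.1, 66.7]

s, alarm = 0.0, None
for m, x in enumerate(averages, start=1):
    s = max(0.0, s + x - r)           # cusum recursion, reset when negative
    if alarm is None and s > h:
        alarm = m                     # first sample at which S_m exceeds h

print(round(r, 1), round(h, 1), alarm)
```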
It can be shown that the cusum method will usually detect an out-of-control condition
(involving a slight process shift) much sooner than the strict Shewhart method,
but with essentially the same risk of a false alarm. For instance, the cusum method 
leads to a decision after 13 samples in Example 11.28 compared with 18 samples in 
Example 11.25 for the same data. The measure used to compare the two methods is the 
average run length (ARL), which is the mean number of samples required to detect 
an increase in proportion of defectives (or process average) to some specified level. It 
has been shown that the ARL for the Shewhart chart can be up to four times that for the 
cusum chart (J. Murdoch, Control Charts, Macmillan, London, 1979). 


11.9.5 Moving-average control charts 


The cusum chart shows that the way to avoid the relative insensitivity of the Shewhart 
chart is to allow the evidence of a shift in performance to accumulate over several 
samples. There are also moving-average control charts, which are based upon a 
weighted sum of a number of observations. The best of these, which is very similar to 
the cusum chart in operation, is the geometric moving-average (GMA) chart. This 
will be described here for variable measure, but it also works for attribute measure 
(Exercise 58). 

Suppose that the successive sample averages are $\bar{X}_1, \bar{X}_2, \ldots$, each from a sample of
size n. Also suppose that the design mean and variance are $\mu_X$ and $\sigma_X^2$. Then the GMA
is the new sequence given by

\[
S_0 = \mu_X
\]
\[
S_m = r\bar{X}_m + (1 - r)S_{m-1} \quad (m = 1, 2, \ldots)
\]

where $0 < r \le 1$ is a constant. The statistical properties of this sequence are simpler
than for the cusum sequence. First, by successively substituting for $S_{m-1}$, $S_{m-2}$ and so on,
we can express $S_m$ directly in terms of the sample averages:

\[
S_m = r\sum_{i=0}^{m-1} (1 - r)^i \bar{X}_{m-i} + (1 - r)^m \mu_X
\]

Then, using the summation formula

\[
\sum_{i=0}^{m-1} x^i = \frac{1 - x^m}{1 - x} \quad (|x| < 1)
\]

it is easy to show (Exercise 57) that the mean and variance of $S_m$ are

\[
\mu_{S_m} = E(S_m) = \mu_X
\]
\[
\sigma_{S_m}^2 = \mathrm{Var}(S_m) = \frac{r}{2 - r}\,[1 - (1 - r)^{2m}]\,\frac{\sigma_X^2}{n}
\]

After the first few samples the variance of $S_m$ tends to a constant value:

\[
\sigma_{S_m}^2 \to \left(\frac{r}{2 - r}\right)\frac{\sigma_X^2}{n} \quad \text{as } m \to \infty
\]

If US practice is followed then the upper and lower control limits can be set at
($\mu_X \pm 3\sigma_{S_m}$). If UK practice is followed then, from the approximate normality of the
sample averages and the fact that sums of normal random variables are also normal, it
follows that $S_m$ is approximately normal, so the warning and action limits can be set at
($\mu_X \pm 1.96\sigma_{S_m}$) and ($\mu_X \pm 3.09\sigma_{S_m}$) respectively (although the warning limits now have
less significance).

It remains to choose a value for r. If we set r = 1, the whole approach reduces to the
standard Shewhart charts. Small values of r (say around 0.2) lead to early recognition
of small shifts of process mean, but if r is too small, a large shift may remain undetected
for some time.


Example 11.30

Construct GMA charts for the sulphur dioxide data in Example 11.27, using r = 0.2
and r = 0.4.


Solution

Figure 11.34 Moving-average control chart for Example 11.30: (a) r = 0.2; (b) r = 0.4.

The control charts can be seen in Figure 11.34. Clearly the warning and action limits
converge fairly quickly to constant values, so little is lost by using those values in
practice. The warning limit is exceeded from sample 11 for both values of r. The action
limit is exceeded from sample 14 for r = 0.2 (as for the cusum chart in Example 11.29)
and at sample 16 for r = 0.4.
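
The GMA recursion and its limits can be sketched in code as follows. The example assumes UK practice (multipliers 1.96 and 3.09), uses the sample-dependent variance of the moving average, and tests only the upper limits; the function names are illustrative rather than taken from the text.

```python
import math

def gma_chart(averages, mu, sigma, n, r):
    """Yield (m, S_m, upper warning limit, upper action limit)."""
    s = mu                                   # S_0 is the design mean
    for m, x in enumerate(averages, start=1):
        s = r * x + (1 - r) * s
        var_s = r / (2 - r) * (1 - (1 - r)**(2 * m)) * sigma**2 / n
        sd_s = math.sqrt(var_s)
        yield m, s, mu + 1.960 * sd_s, mu + 3.090 * sd_s

def first_crossings(averages, mu, sigma, n, r):
    """Return the first sample numbers above the warning and action limits."""
    warning = action = None
    for m, s, warn_limit, act_limit in gma_chart(averages, mu, sigma, n, r):
        if warning is None and s > warn_limit:
            warning = m
        if action is None and s > act_limit:
            action = m
    return warning, action

averages = [64.2, 56.9, 57.7, 67.9, 61.7, 59.7, 55.6, 63.7,
            58.3, 66.4, 67.2, 65.2, 63.1, 67.6, 64.1, 66.7]
result_02 = first_crossings(averages, 60.0, 8.0, 5, r=0.2)
result_04 = first_crossings(averages, 60.0, 8.0, 5, r=0.4)
print(result_02, result_04)
```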









































11.9.6 Range charts


The sample range is defined as the difference between the largest and smallest values 
in the sample. The range has two functions in quality control where the quality is of 
the variable rather than the attribute type. First, if the data are normal then the range 
(R, say) provides an estimate $\hat{\sigma}$ of the standard deviation $\sigma$ by

\[
\hat{\sigma} = R/d
\]

where d is a constant that depends upon the sample size n as follows:

n    2      3      4      5      6      7      8      9      10     11     12
d    1.128  1.693  2.059  2.326  2.534  2.704  2.847  2.970  3.078  3.173  3.258





It is clearly quicker to evaluate this than the sample standard deviation S, and for the 
small samples typically used in quality control the estimate is almost as good. 
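
As an illustration of the range estimate, the following sketch looks up d for the sample size and divides the range by it. The measurements are hypothetical values chosen for the example and do not come from the text.

```python
# Values of d for sample sizes 2 to 12, as tabulated in the text
D_FACTOR = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534, 7: 2.704,
            8: 2.847, 9: 2.970, 10: 3.078, 11: 3.173, 12: 3.258}

def sigma_from_range(sample):
    """Estimate the standard deviation of normal data from the sample range."""
    sample_range = max(sample) - min(sample)
    return sample_range / D_FACTOR[len(sample)]

sample = [10.1, 9.8, 10.4, 10.0, 9.7]   # hypothetical measurements
estimate = sigma_from_range(sample)
print(round(estimate, 3))
```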

The other reason why the range is important is that the quality of production can 
vary in dispersion as well as (or instead of) in mean. Control charts for the range R are 
more commonly used than charts for the sample standard deviation S when monitoring 
variability within the manufacturing process, and all three types of chart discussed 
above (Shewhart, cusum and moving-average) can be applied to the range. Range 
charts (or R charts) are designed using tables that can be found in specialized books 
on quality control, for example D. C. Montgomery, Introduction to Statistical Quality 


Control, 2nd edn, Wiley, New York, 1991. 


11.9.7 Exercises 


48  It is intended that 90% of electronic devices emerging from a machine should pass
    a simple on-the-spot quality test. The numbers of defectives among samples of 50
    taken by successive shifts are as follows:

    5, 8, 11, 5, 6, 4, 9, 7, 12, 9, 10, 14

    Find the action and warning limits, and the sample number at which an
    out-of-control decision is taken. Also find the UCL (US practice) and the
    sample number for action.

49  Thirty-two successive samples of 100 castings each, taken from a production line,
    contained the following numbers of defectives:

    3, 3, 5, 3, 5, 0, 3, 1, 3, 5, 4, 2, 4, 3, 5, 4,
    3, 4, 5, 6, 5, 6, 4, 4, 7, 5, 4, 8, 5, 6, 6, 7

    If the proportion that are defective is to be maintained at 0.02, use the Shewhart
    method (both UK and US standards) to indicate whether this proportion is being
    maintained, and if not then give the number of samples after which action
    should be taken.

50  A bottling plant is supposed to fill bottles with 568 ml (one imperial pint) of
    liquid. The standard deviation of the quantity of fill is 3 ml. Regular samples of
    10 bottles are taken and their contents measured. After subtracting 568 from the
    sample averages, the results are as follows:

    -0.2, 1.3, 2.1, 0.3, -0.8, 1.7, 1.3, 0.6, 2.5,
    1.4, 1.6, 3.0

    Using a Shewhart control chart, determine whether the mean fill requires
    readjustment.

51  Average reverse-current readings (in nA) for samples of 10 transistors taken at
    half-hour intervals are as follows:

    12.8, 11.2, 13.4, 12.1, 13.6, 13.9, 12.3, 12.9,
    13.8, 13.1, 12.9, 14.0, 13.7, 13.4, 14.2, 13.1,
    14.0, 14.0, 15.1, 14.3

    The standard deviation is 3 nA. At what point, if any, does the Shewhart control
    method indicate that the reverse current has increased from its design value of
    12 nA?

52  Using the data in Exercise 50, apply (a) a cusum control chart and (b) a
    moving-average control chart with r = 0.3.

53  Using the data in Exercise 51, apply (a) a cusum control chart and (b) a
    moving-average control chart.

54  Apply a cusum control chart to the data in Exercise 48.

55  Apply a cusum control chart to the data in Exercise 49.

56  The diameters of the castings in Exercise 49 are also important. Twelve of each
    sample of 100 were taken, and their diameters measured and averaged. The
    differences (in mm) between the successive averages and the design mean diameter
    of 125 mm were as follows:

    0.1, 0.3, -0.2, 0.4, 0.1, 0.0, 0.2, -0.1, 0.2,
    0.4, 0.5, 0.1, 0.4, 0.6, 0.3, 0.4, 0.3, 0.6, 0.5,
    0.4, 0.2, 0.3, 0.5, 0.7, 0.3, 0.1, 0.6, 0.5, 0.6,
    0.7, 0.4, 0.5

    Use (a) Shewhart, (b) cusum and (c) moving-average (with r = 0.2) control methods
    to test for an increase in actual mean diameter, assuming a standard deviation
    of 1 mm.

57  Prove that the mean and variance of the geometric moving-average $S_m$ defined in
    Section 11.9.5 for variable measure are given by

    \[
    \mu_{S_m} = E(S_m) = \mu_X
    \]
    \[
    \sigma_{S_m}^2 = \mathrm{Var}(S_m) = \frac{r}{2 - r}\,[1 - (1 - r)^{2m}]\,\frac{\sigma_X^2}{n}
    \]

58  Suppose that the moving-average control chart is to be applied to the counts of
    defectives in attribute quality control. Find the mean and variance of $S_m$
    in terms of the sample size n, the design proportion of defectives p and the
    coefficient r. Following US practice, set the upper control limit at three
    standard deviations above the mean, and apply the method to the data in
    Example 11.28, using r = 0.2.

59  The design diameter of a moulded plastic component is 6.00 cm, with a standard
    deviation of 0.2 cm. The following data consist of successive averages of samples
    of 10 components:

    6.04, 6.12, 5.99, 6.02, 6.04, 6.11, 5.97, 6.06,
    6.05, 6.06, 6.17, 6.03, 6.13, 6.05, 6.17, 5.97,
    6.07, 6.14, 6.03, 5.99, 6.10, 6.01, 5.96, 6.12,
    6.02, 6.20, 6.11, 5.98, 6.02, 6.12

    After how many samples do the Shewhart, cusum and moving-average (with r = 0.2)
    control methods indicate that action is needed?

11.10 Poisson processes and the theory of queues 


Probability theory is often applied to the analysis and simulation of systems, and this 
can be a valuable aid to design and control. This section, which is therefore applied 
probability rather than statistics, will illustrate how this progresses from an initial 
mathematical model through the analysis stage to a simulation. 


11.10.1 Typical queueing problems 


Queues are everywhere: in banks and ticket offices, at airports and seaports, traffic 
intersections and hospitals, and in computer and communication networks. Somebody 
has to decide on the level of service facilities. The problem, in essence, is that it is 
costly to keep customers waiting for a long time, but it is also costly to provide enough 
service facilities so that no customer ever has to wait at all. Queues of trucks, 
aeroplanes or ships may be costly because of the space they occupy or the lost earnings 
during the idle time. Queues of people may be costly because of lost productivity or 
because people will often go elsewhere in preference to joining a long queue. Queues 
of jobs or packets in computer networks are costly in loss of time-efficiency. Service 
facilities are costly in capital, staffing and maintenance. Probabilistic modelling, often 
combined with simulation, allows performance evaluation for queues and networks, 
which can be of great value in preparing the ground for design decisions. 

The mathematical model of a simple queueing system is based on the situation shown 
in Figure 11.35. Customers join the queue at random times that are independent of 
each other; the inter-arrival time (between successive arrivals) is a random variable. 
When a service channel is free, the next customer to be served is selected from the 
queue in a manner determined by the service discipline. After being served the customer 
departs from the queueing system. The service time for each customer is another 
random variable. The distributions of inter-arrival time and service time are usually 
assumed to take one of a number of standard patterns. The commonest assumption 
about service discipline is that the next customer to be served is the one who has been 
queueing the longest time (first in, first out). 

Figure 11.35 A typical queueing system (labels: Arrivals, Service channels, Departures). 

The queueing system may be regarded from either a static or a dynamic viewpoint. 
Dynamically, the system might start from an initial state of emptiness and build up with 
varying rates of arrivals and varying numbers of service channels depending upon 
queue length. This is hard to deal with mathematically, but can be treated by computer 
simulation. Useful information about queues can, however, be obtained from the static 
viewpoint, in which the rate at which arrivals occur is constant, as is the number of 
service channels, and the system is assumed to have been in operation sufficiently long 
to have reached a steady state. At any time the queue length will be a random variable, 
but the distribution of queue length is then independent of time. 

We need to find the distributions of queue length and of waiting time for the 
customer, and how these vary with the number of service channels. Costs can be worked 
out from these results. 


11.10.2 Poisson processes 


Consider the arrivals process for a queueing system. We shall assume that the customers 
join the queue at random times that are independent of each other. Other assumptions 
about the pattern of arrivals would give different results, but this is the most common 
one. We can therefore think of the arrivals as a stream of events occurring at random 
along a time axis, as depicted in Figure 11.36. The inter-arrival time $T$, say, will be 
a continuous random variable with probability density function $f_T(t)$ and cumulative 
distribution function $F_T(t)$. 

Figure 11.36 Random events (×) on a time axis. 


One way to formulate the assumption of independent random arrivals is to assert that 
at any moment the distribution of the time until the next arrival is independent of the 
time elapsed since the previous arrival (because arrivals are ‘blind’ to each other). This 
is known as the memoryless property, and can be expressed as 

\[ P(T \le t + h \mid T > t) = P(T \le h) \qquad (t, h \ge 0) \]

where $t$ denotes the actual time since the previous arrival and $h$ denotes a possible time 
until the next arrival. Using the definition of conditional probability (Section 11.2.1), 
we can write this in terms of the distribution function $F_T(t)$ as 

\[ \frac{P(t < T \le t + h)}{1 - P(T \le t)} = \frac{F_T(t + h) - F_T(t)}{1 - F_T(t)} = F_T(h) \]

Rearranging at the second equality and then dividing through by $h$ gives 

\[ \frac{F_T(t + h) - F_T(t)}{h} = \frac{F_T(h)}{h}\,[1 - F_T(t)] \]

Letting $h \to 0$, we obtain a first-order linear differential equation for $F_T(t)$: 

\[ \frac{dF_T}{dt} = \lambda [1 - F_T(t)] \]

where 

\[ \lambda = \lim_{h \to 0} \frac{F_T(h)}{h} \]

With the initial condition $F_T(0) = 0$ (because inter-event times must be positive), the 
solution is 

\[ F_T(t) = 1 - e^{-\lambda t} \qquad (t \ge 0) \]

and hence the probability density function is 

\[ f_T(t) = \frac{d}{dt} F_T(t) = \lambda e^{-\lambda t} \qquad (t \ge 0) \]

This is the density function of an exponential distribution with parameter $\lambda$, and it 
follows that the mean time between arrivals is $1/\lambda$ (see Section 11.7.1). The parameter 
$\lambda$ is the rate of arrivals (number per unit time). 


Example 11.31 

A factory contains 30 machines of a particular type, each of which breaks down every 
100 operating hours on average. It is suspected that the breakdowns are not 
independent. The operating time intervals between 10 consecutive breakdowns (of any machine) 
are measured and the shortest such interval is only six minutes. Does this lend support 
to the suspicion of non-independent breakdowns? 


Solution 

Collectively, the machines break down at the rate of 30/100 or 0.3 per hour. If the 
breakdowns are independent then the interval between successive breakdowns will 
have an exponential distribution with parameter 0.3. The probability that such an interval 
will exceed six minutes (0.1 h) is 

\[ P(\text{interval} > 0.1) = \int_{0.1}^{\infty} 0.3\,e^{-0.3t}\,dt = e^{-0.03} = 0.9704 \]

and the probability that all nine intervals (between 10 breakdowns) will exceed this 
time is $(0.9704)^9 = 0.763$. Hence the probability that the shortest interval will be six 
minutes or less is one minus this, or 0.237. This is quite likely to have happened by 
chance, so it does not support the suspicion of non-independent intervals. 
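
The arithmetic in Example 11.31 can be checked numerically. The following Python sketch is an illustration only and is not part of the original text; the function names are assumptions. It evaluates the exact probability that the shortest of nine independent exponential intervals is at most 0.1 h, and compares it with a Monte Carlo estimate.

```python
import math
import random

def prob_shortest_at_most(rate, n_intervals, threshold):
    """Exact P(shortest of n independent Exp(rate) intervals <= threshold)."""
    # All n intervals exceed the threshold with probability exp(-rate*threshold)**n.
    return 1.0 - math.exp(-rate * threshold * n_intervals)

def simulate_shortest(rate, n_intervals, threshold, trials=100_000, seed=1):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        shortest = min(rng.expovariate(rate) for _ in range(n_intervals))
        if shortest <= threshold:
            hits += 1
    return hits / trials

# Rate 0.3 per hour, nine intervals, six minutes = 0.1 h.
exact = prob_shortest_at_most(0.3, 9, 0.1)
estimate = simulate_shortest(0.3, 9, 0.1)
print(round(exact, 3), round(estimate, 3))
```

The exact value agrees with the 0.237 quoted in the solution, and the simulated estimate should lie close to it.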


The assumption of independent random arrivals therefore leads to a particular 
distribution of inter-arrival time, parametrized by the rate of arrivals. Two further 
conclusions also emerge. First, the number of arrivals that occur during a fixed interval 
of length $H$ has a Poisson distribution with parameter $\lambda H$: 

\[ P(k \text{ arrivals during interval of length } H) = \frac{(\lambda H)^k e^{-\lambda H}}{k!} \qquad (k = 0, 1, 2, \ldots) \]

This will not be proved here, but is easily seen to be consistent with an exponential 
distribution of inter-arrival time $T$ because 

\[ F_T(t) = P(T \le t) = 1 - P(T > t) = 1 - P(\text{no event during interval of length } t) = 1 - e^{-\lambda t} \]

using the Poisson distribution. Because of this distribution, events conforming to these 
assumptions are known as a Poisson process. 

The other conclusion is that the probability that an arrival occurs during a short 
interval of length $h$ is equal to $\lambda h + O(h^2)$, regardless of the history of the process. 


Suppose that a time $t$ has elapsed since the previous arrival, and consider a short interval 
of length $h$ starting from that point: 

\[ P(\text{arrival during } (t, t + h)) = P(T \le t + h \mid T > t) = F_T(h) = 1 - e^{-\lambda h} = \lambda h + O(h^2) \]

using the memoryless property and the expansion of $e^{-\lambda h}$ to first order. Furthermore, the 
probability of more than one arrival during a short interval of length $h$ is $O(h^2)$. 
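
The link between exponential inter-arrival times and Poisson counts, which is stated above without proof, can be illustrated by simulation. The Python sketch below is an illustration under assumed parameter values (rate 2.5 per unit time, interval length 1); the function names are hypothetical. It generates arrivals with independent exponential gaps and compares the observed frequencies of k arrivals with the Poisson formula.

```python
import math
import random

def poisson_pmf(k, mean):
    """P(k arrivals) for a Poisson distribution with the given mean."""
    return mean**k * math.exp(-mean) / math.factorial(k)

def count_arrivals(rate, horizon, rng):
    """Count events in (0, horizon] when the gaps are independent Exp(rate)."""
    t = rng.expovariate(rate)
    count = 0
    while t <= horizon:
        count += 1
        t += rng.expovariate(rate)
    return count

rng = random.Random(2)
rate, horizon, trials = 2.5, 1.0, 50_000
counts = [count_arrivals(rate, horizon, rng) for _ in range(trials)]

# Observed relative frequency against the Poisson probability for k = 0..5.
for k in range(6):
    observed = counts.count(k) / trials
    print(k, round(observed, 3), round(poisson_pmf(k, rate * horizon), 3))
```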


Example 11.32 

A computer receives on average 60 batch jobs per day. They arrive at a constant rate 
throughout the day and independently of each other. Find the probability that more than 
four jobs will arrive in any one hour. 


Solution 

The assumptions for a Poisson process hold, so the number of jobs arriving in one hour 
is a Poisson random variable with parameter $\lambda H = 60/24$. Hence 

\[ P(\text{more than four jobs}) = 1 - P(0 \text{ or } 1 \text{ or } 2 \text{ or } 3 \text{ or } 4 \text{ jobs}) \]

\[ = 1 - e^{-\lambda H}\left[1 + \lambda H + \frac{(\lambda H)^2}{2!} + \frac{(\lambda H)^3}{3!} + \frac{(\lambda H)^4}{4!}\right] = 0.109 \]
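
As a check on Example 11.32, the Poisson tail probability can be evaluated directly. This is a minimal Python sketch; the function name `poisson_tail` is an assumption introduced for illustration.

```python
import math

def poisson_tail(k, mean):
    """P(N > k) for a Poisson random variable N with the given mean."""
    cdf = sum(mean**j * math.exp(-mean) / math.factorial(j) for j in range(k + 1))
    return 1.0 - cdf

# 60 jobs per day arriving at a constant rate gives a mean of 60/24 per hour.
p_more_than_four = poisson_tail(4, 60 / 24)
print(round(p_more_than_four, 3))  # 0.109
```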
11.10.3 Single service channel queue 


Consider a queueing system with a Poisson arrival process with mean rate $\lambda$ per unit 
time, and a single service channel. The behaviour of the queueing system depends not 
only on the arrival process but also upon the distribution of service times. A common 
assumption here is that the service time distribution (like that of inter-arrival time) is 
exponential. Thus the probability density function of service time $S$ is 

\[ f_S(s) = \mu e^{-\mu s} \qquad (s \ge 0) \]

Unlike the inter-arrival time distribution in Section 11.10.2, this is not based on an 
assumption of independence or the memoryless property, but simply on the fact that in 
many queueing situations most customers are served quickly but a few take a lot longer, 
and the form of the distribution conforms with this fact. This assumption is therefore 
on much weaker ground than that for the arrival time distribution. The parameter $\mu$ is 
the mean number of customers served in unit time (with no idle periods), and the mean 
service time is $1/\mu$. With this service distribution, the probability that a customer in the 
service channel will have departed after a short time $h$ is equal to $\mu h + O(h^2)$, independent 
of the time already spent in the service channel. 


Distribution of the number of customers in the system 


We can now derive the distribution of the number of customers in the queueing system. 
Considering the system as a whole (queue plus service channel), the number of 
customers in the system at time $t$ is a random variable. Let $p_n(t)$ be the distribution of this 
random variable: 

\[ p_n(t) = P(n \text{ customers in the system at time } t) \qquad (n = 0, 1, 2, \ldots) \]

Consider the time $t + h$, where $h$ is small. The probability of more than one arrival or 
more than one departure during this time is $O(h^2)$, and will be ignored. There are four 
ways in which there can be $n$ (assumed greater than zero) customers in the system at 
that time: 


(1) there are $n$ in the system at $t$, and no arrival or departure by $t + h$; the probability 
of this is given by 

\[ p_n(t)(1 - \lambda h)(1 - \mu h) + O(h^2) = p_n(t)(1 - \lambda h - \mu h) + O(h^2) \]

(2) there are $n$ in the system at $t$, and one arrival and one departure by $t + h$; the 
probability is given by 

\[ p_n(t)(\lambda h)(\mu h) + O(h^2) = O(h^2) \]

(3) there are $n - 1$ in the system at $t$, and one arrival but no departure by $t + h$; the 
probability is given by 

\[ p_{n-1}(t)(\lambda h)(1 - \mu h) + O(h^2) = p_{n-1}(t)(\lambda h) + O(h^2) \]

(4) there are $n + 1$ in the system at $t$, and no arrivals but one departure by $t + h$; the 
probability is given by 

\[ p_{n+1}(t)(1 - \lambda h)(\mu h) + O(h^2) = p_{n+1}(t)(\mu h) + O(h^2) \]

Summing the probabilities of these mutually exclusive events gives the probability of 
$n$ customers in the system at time $t + h$ as 

\[ p_n(t + h) = p_n(t)(1 - \lambda h - \mu h) + p_{n-1}(t)(\lambda h) + p_{n+1}(t)(\mu h) + O(h^2) \qquad (n = 1, 2, \ldots) \tag{11.1} \]

Similarly, there are two ways in which the system can be empty ($n = 0$) at time $t + h$: 
empty at $t$ and no arrival before $t + h$, or one customer at $t$ who departs before $t + h$. 
This gives 

\[ p_0(t + h) = p_0(t)(1 - \lambda h) + p_1(t)(\mu h) + O(h^2) \tag{11.2} \]

Rearranging equations (11.1) and (11.2) and taking the limit as $h \to 0$, we obtain 

\[ \frac{d}{dt} p_n(t) = \lim_{h \to 0} \frac{1}{h}[p_n(t + h) - p_n(t)] = -(\lambda + \mu) p_n(t) + \lambda p_{n-1}(t) + \mu p_{n+1}(t) \qquad (n = 1, 2, \ldots) \]

\[ \frac{d}{dt} p_0(t) = -\lambda p_0(t) + \mu p_1(t) \]

This is a rather complex set of recursive differential equations for the probabilities $p_n(t)$. 
If we assume that the arrival and service parameters $\lambda$ and $\mu$ are constant and that the 
system has been in operation for a long time then the distribution will not depend upon $t$; 
the derivatives therefore vanish, and we are left with the following algebraic equations 
for the steady-state distribution $p_n$: 

\[ 0 = -(\lambda + \mu) p_n + \lambda p_{n-1} + \mu p_{n+1} \qquad (n = 1, 2, \ldots) \]

\[ 0 = -\lambda p_0 + \mu p_1 \]

Defining the ratio of arrival and service parameters $\lambda$ and $\mu$ as $\rho = \lambda/\mu$ and dividing 
through by $\mu$, we have 

\[ p_{n+1} = (1 + \rho) p_n - \rho p_{n-1} \qquad (n = 1, 2, \ldots) \]

\[ p_1 = \rho p_0 \]

To solve these, we first assume that $p_n = \rho^n p_0$. Clearly this works for $n = 0$ and $n = 1$. 
Substituting, 

\[ p_{n+1} = (1 + \rho)\rho^n p_0 - \rho\,\rho^{n-1} p_0 = \rho^{n+1} p_0 \]

so the assumed form holds for $n + 1$, and therefore for all $n$ by induction. It remains only 
to identify $p_0$ from the fact that the distribution must sum to unity over $n = 0, 1, 2, \ldots$: 

\[ 1 = \sum_{n=0}^{\infty} p_n = p_0 \sum_{n=0}^{\infty} \rho^n = \frac{p_0}{1 - \rho} \]

Hence $p_0 = 1 - \rho$ and 

\[ p_n = (1 - \rho)\rho^n \qquad (n = 0, 1, 2, \ldots) \]

This is known as the geometric distribution, and is a discrete version of the 
exponential distribution (Figure 11.37). Note that this result requires that $\rho < 1$, or equivalently 
$\lambda < \mu$. If this condition fails to hold, the arrival rate swamps the capacity of the service 
channel, the queue gets longer and longer, and no steady-state condition exists. 
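
The induction argument can also be checked numerically by iterating the steady-state recurrence and normalizing. The Python sketch below is an illustration only; the truncation at 80 terms and the function name are assumptions, and the value 0.75 for the ratio of arrival to service rate is the one used in Figure 11.37.

```python
def steady_state_by_recurrence(rho, n_max):
    """Iterate p_{n+1} = (1 + rho) p_n - rho p_{n-1} with p_1 = rho p_0, then normalize.

    The infinite distribution is truncated at n_max, which is harmless when
    rho**n_max is negligible.
    """
    p = [1.0, rho]  # unnormalized start: p_0 = 1, p_1 = rho
    for n in range(1, n_max):
        p.append((1.0 + rho) * p[n] - rho * p[n - 1])
    total = sum(p)
    return [x / total for x in p]

rho = 0.75
numeric = steady_state_by_recurrence(rho, 80)
closed_form = [(1.0 - rho) * rho**n for n in range(len(numeric))]

# Largest absolute difference between the iterated and closed-form values.
worst = max(abs(a - b) for a, b in zip(numeric, closed_form))
print(worst)
```

The difference is at the level of the truncation and rounding error, which supports the geometric form derived above.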
Figure 11.37 Geometric distribution (with $\rho = 0.75$). 


Queue length and waiting time 

The queue length distribution now follows easily: 

\[ P(\text{queue empty}) = p_0 + p_1 = 1 - \rho^2 \]

\[ P(n \text{ in queue}) = P(n + 1 \text{ in system}) = (1 - \rho)\rho^{n+1} \qquad (n = 1, 2, \ldots) \]

Denoting the mean numbers of customers in the system and in the queue by $N_S$ and $N_Q$ 
respectively, 

\[ N_S = \sum_{n=0}^{\infty} n p_n = \frac{\rho}{1 - \rho}, \qquad N_Q = \sum_{n=1}^{\infty} (n - 1) p_n = \frac{\rho^2}{1 - \rho} \]

(Exercise 62). Since in the steady state the mean time between departures must equal 
the mean time between arrivals ($1/\lambda$), it is plausible that the mean total time in the 
system for each customer, $W_S$ say, is given by 

\[ W_S = \text{mean number in system} \times \text{mean time between departures} = \frac{\rho}{1 - \rho} \cdot \frac{1}{\lambda} = \frac{1}{\mu - \lambda} \]

The mean waiting time in the queue, $W_Q$ say, is then 

\[ W_Q = \text{mean time in system} - \text{mean service time} = \frac{1}{\mu - \lambda} - \frac{1}{\mu} = \frac{\rho}{\mu - \lambda} \]

These results for $W_S$ and $W_Q$ can be derived more formally from the respective waiting 
time distributions. For example, the distribution of total time in the system can be 
shown to be exponential with parameter $\mu - \lambda$, and the waiting time in the queue can 
be expressed as 

\[ P(\text{waiting time in queue} \ge t) = \rho e^{-(\mu - \lambda)t} \qquad (t > 0) \]
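
The single-channel results can be collected into a small helper for numerical work. This is a sketch only; the function names and the dictionary keys are assumptions introduced here, and the rates used in the usage lines (2 arrivals and 3 services per unit time) are illustrative values.

```python
import math

def mm1_metrics(lam, mu):
    """Steady-state means for a single-channel queue with Poisson arrivals
    (rate lam) and exponential service (rate mu)."""
    if lam >= mu:
        raise ValueError("a steady state requires lam < mu")
    rho = lam / mu
    return {
        "rho": rho,
        "N_S": rho / (1 - rho),      # mean number in the system
        "N_Q": rho**2 / (1 - rho),   # mean number in the queue
        "W_S": 1 / (mu - lam),       # mean total time in the system
        "W_Q": rho / (mu - lam),     # mean waiting time in the queue
    }

def prob_wait_at_least(lam, mu, t):
    """P(waiting time in queue >= t) for t > 0."""
    return (lam / mu) * math.exp(-(mu - lam) * t)

metrics = mm1_metrics(lam=2.0, mu=3.0)
print(metrics)
print(prob_wait_at_least(2.0, 3.0, 1.0))
```

For these rates the helper returns a mean of 2 customers in the system and a mean total time of 1 time unit, consistent with the relation between $N_S$ and $W_S$ used above.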


Example 11.33 

If customers in a shop arrive at a single check-out point at the rate of 30 per hour and if 
the service times have an exponential distribution, what mean service time will ensure 
that 80% of customers do not have to wait more than five minutes in the queue and 
what will be the mean queue length? 


Solution 

Working in minutes, with $\lambda = 0.5$ and $t = 5$, the queue waiting time gives 

\[ 0.2 = \rho e^{-(\mu - \lambda)t} \]

that is, 

\[ 0.2\mu = 0.5\,e^{2.5 - 5\mu} \]

This is a nonlinear equation for $\mu$, which may be solved by standard methods to give 
$\mu = 0.743$. The mean service time is therefore $1/\mu$ or 1.35 min, and the mean queue 
length is 

\[ N_Q = \frac{\rho^2}{1 - \rho} = 1.39, \quad \text{using } \rho = \frac{\lambda}{\mu} = 0.673 \]
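
One of the standard methods for the nonlinear equation in Example 11.33 is bisection. The Python sketch below is an illustration, not the method prescribed by the text; the bracketing interval (0.5001, 5.0) and the function names are assumptions.

```python
import math

def wait_excess(mu, lam=0.5, t=5.0, target=0.2):
    """P(wait in queue >= t) minus the target, for service rate mu."""
    return (lam / mu) * math.exp(-(mu - lam) * t) - target

def bisect(f, lo, hi, tol=1e-10):
    """Find a root of f in [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

# The service rate must exceed the arrival rate of 0.5 per minute.
mu = bisect(wait_excess, 0.5001, 5.0)
rho = 0.5 / mu
mean_queue_length = rho**2 / (1 - rho)
print(round(mu, 3), round(1 / mu, 2), round(mean_queue_length, 2))
```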


Example 11.34 

Handling equipment is to be installed at an unloading bay in a factory. An average of 
20 trucks arrive during each 10 h working day, and these must be unloaded. The 
following three schemes are being considered: 

Scheme   Fixed cost/   Operating cost/   Mean handling rate/ 
         £ per day     £ per hour        trucks per hour 

A         90           45                3 
B        190           50                4 
C        450           60                6 

Truck waiting time is costed at £30 per hour. Assuming an exponential distribution of 
truck unloading time, find the best scheme. 


Solution 

Viewing this as a queueing problem, we have 

$\lambda$ = arrival rate per hour = 2.0 

$\mu$ = unloading rate per hour 

mean waiting time for each truck = $1/(\mu - \lambda)$ 

Hence the mean delay cost per truck is 

\[ \frac{30}{\mu - 2} \]

and the mean delay cost per day is 

\[ \frac{20 \times 30}{\mu - 2} \]

The proportion of time that the equipment is running is equal to the probability that the 
system is not empty (the utilization), which is 

\[ 1 - p_0 = \rho = \frac{\lambda}{\mu} \]

Hence the mean operating cost per day is $10\rho$ times operating cost per hour. The total 
cost per day (in £) is the sum of the fixed, operating and delay costs, as follows: 

Scheme   μ   ρ        Fixed   Operating   Delay   Total 

A        3   0.6667    90     300         600     990 
B        4   0.5      190     250         300     740 
C        6   0.3333   450     200         150     800 

Hence scheme B minimizes the total cost. 
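
The cost comparison in Example 11.34 is easily reproduced. The following Python sketch is an illustration; the function and variable names are assumptions, while the figures are those given in the example.

```python
# (fixed cost per day, operating cost per hour, handling rate per hour)
schemes = {"A": (90, 45, 3), "B": (190, 50, 4), "C": (450, 60, 6)}

def daily_cost(fixed, operating_per_hour, mu,
               lam=2.0, hours=10.0, trucks=20, delay_rate=30.0):
    """Total daily cost: fixed + operating (while busy) + truck delay."""
    rho = lam / mu                             # utilization of the equipment
    operating = hours * rho * operating_per_hour
    delay = trucks * delay_rate / (mu - lam)   # trucks x cost rate x mean time in system
    return fixed + operating + delay

totals = {name: daily_cost(*params) for name, params in schemes.items()}
best = min(totals, key=totals.get)
print(totals, best)
```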


11.10.4 Queues with multiple service channels 


For the case where there are $c$ service channels, all with an exponential service time 
distribution with parameter $\mu$, a line of argument similar to that in Section 11.10.3 can 
be found in many textbooks on queueing theory. In particular, it can be shown that the 
distribution $p_n$ of the number of customers in the system is 

\[ p_n = \begin{cases} \dfrac{\rho^n}{n!}\,p_0 & (0 \le n \le c) \\[2ex] \dfrac{\rho^n}{c^{\,n-c}\,c!}\,p_0 & (n > c) \end{cases} \]

where $\rho = \lambda/\mu$ and 

\[ p_0 = \left[\sum_{n=0}^{c-1} \frac{\rho^n}{n!} + \frac{\rho^c}{(c - 1)!\,(c - \rho)}\right]^{-1} \]

The mean numbers in the queue and in the system are 

\[ N_Q = \frac{\rho^{c+1}}{(c - 1)!\,(c - \rho)^2}\,p_0, \qquad N_S = N_Q + \rho \]

and the mean waiting times in the queue and in the system are 

\[ W_Q = \frac{N_Q}{\lambda}, \qquad W_S = W_Q + \frac{1}{\mu} \]
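
The multiple-channel formulae translate directly into code. The Python sketch below is an illustration; the function name and return order are assumptions, and the steady-state check (the ratio of arrival to service rate must be less than the number of channels) is a standard condition that is not stated explicitly in the text.

```python
import math

def mmc_metrics(lam, mu, c):
    """Return (p0, N_Q, N_S, W_Q, W_S) for c exponential service channels."""
    rho = lam / mu
    if rho >= c:
        raise ValueError("a steady state requires lam/mu < c")
    p0 = 1.0 / (sum(rho**n / math.factorial(n) for n in range(c))
                + rho**c / (math.factorial(c - 1) * (c - rho)))
    n_q = rho**(c + 1) / (math.factorial(c - 1) * (c - rho)**2) * p0
    n_s = n_q + rho
    w_q = n_q / lam
    w_s = w_q + 1.0 / mu
    return p0, n_q, n_s, w_q, w_s

# Two channels with arrival rate 2 and service rate 3 per channel.
p0, n_q, n_s, w_q, w_s = mmc_metrics(lam=2.0, mu=3.0, c=2)
print(p0, n_q, w_s)
```

With a single channel the same function reduces to the results of Section 11.10.3, which gives a convenient consistency check.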


Example 11.35 

For the unloading bay problem in Example 11.34 a fourth option would be to install 
two sets of equipment under scheme A (there is space available to do this). The fixed 
costs would then double but the operating costs per bay would be the same. Evaluate 
this possibility. 


Solution 

With two bays under scheme A, we have $\lambda = 2$, $\mu = 3$ and $c = 2$, so that $\rho = \frac{2}{3}$, and the 
probability that the system is empty at any time is 

\[ p_0 = \left(1 + \rho + \frac{\rho^2}{2 - \rho}\right)^{-1} = \frac{1}{2} \]

The probabilities of one truck (one bay occupied) and of two or more trucks (both bays 
occupied) are then 

\[ p_1 = \rho p_0 = \tfrac{1}{3} \]

\[ P(\text{two or more trucks}) = 1 - \tfrac{1}{2} - \tfrac{1}{3} = \tfrac{1}{6} \]

The total operating cost per day is the operating cost for when one or other bay is working 
(£45 per hour) plus that for when both bays are working (£90 per hour), which is 

\[ 10\left[\tfrac{1}{3}(45) + \tfrac{1}{6}(90)\right] = 300 \]

The mean number in the queue is 

\[ N_Q = \frac{\rho^3}{(2 - \rho)^2}\,p_0 = 0.083\,33 \]

so that the mean total time in the system for each truck is 

\[ \tfrac{1}{2}(0.083\,33) + \tfrac{1}{3} = 0.375 \]

Multiplying by the cost per hour and the number of trucks gives the delay cost per day: 

20(30)(0.375) = 225 

The total cost per day of this scheme is therefore 

2(90) + 300 + 225 = £705 


This is less than the £740 under scheme B, the best of the single-bay options. 


11.10.5 Queueing system simulation 


The assumption that the service time distribution is exponential, which underlies the 
results in Sections 11.10.3 and 11.10.4, is often unrealistic. It is known that it leads to 
predicted waiting times that tend to be pessimistic, as a result of which costs based 
on these predictions are often overestimated. Theoretical results for other service 
distributions exist (see for example E. Page, Queueing Theory in OR. Butterworth, London, 
1972), but it is often instructive to simulate a queueing system and find the various 
answers numerically. It is then easy to vary the arrival and service distributions, and the 
transient (non-steady-state) behaviour of the system also reveals itself. 

Figure 11.38 shows a pseudocode listing of a single-channel queueing system 
simulation, which is easily modified to cope with multiple channels. Each event consists of either 
an arrival or a departure. The variables next_arrival and next_departure are used to 
represent the time to the next arrival and the time to the next departure respectively, 
and the type of the next event is determined by whichever is the smaller. New arrival 
and departure times are returned by the functions arrival_time and departure_time, 
which generate values from appropriate exponential distributions. The distribution of 
the number of customers in the system is built up as an array, normalized at the end of 
the simulation, from which the mean and other results can be obtained. 

Figure 11.38 Pseudocode listing for queueing system simulation. 


{ Procedure to simulate a single-channel queueing system. 
  time is the running time elapsed, 
  limit is the simulation length, 
  number is the number of items in the system, 
  maximum is the maximum number allowed in the system, 
  system[i] is the distribution (array) of times with i in the 
    system, assumed initialized to zero, 
  mean is the mean number in the system, 
  infinity contains a very large number, 
  arrival_rate is the arrival distribution parameter, 
  service_rate is the service distribution parameter, 
  next_arrival is the time to the next arrival, 
  next_departure is the time to the next departure, 
  rnd() is a function assumed to return a uniform random (0,1) 
    number. 
  The simulation starts with the system empty, 
  limit, maximum, arrival_rate and service_rate must be set. } 

time ← 0 
next_arrival ← arrival_time(arrival_rate) 
next_departure ← infinity 
number ← 0 
repeat 
  if next_arrival < next_departure then 
    time ← time + next_arrival 
    system[number] ← system[number] + next_arrival 
    arrival() 
  else 
    time ← time + next_departure 
    system[number] ← system[number] + next_departure 
    departure() 
  endif 
until time > limit 
mean ← 0 
for i is 0 to maximum do 
  system[i] ← system[i]/time 
  mean ← mean + i*system[i] 
endfor 

procedure arrival() 
{ procedure to handle an arrival, 
  changes the values of next_arrival, next_departure, and number } 
  if number = 0 then 
    next_departure ← departure_time(service_rate) 
  else 
    next_departure ← next_departure - next_arrival 
  endif 
  number ← number + 1 
  if number = maximum then 
    next_arrival ← infinity 
  else 
    next_arrival ← arrival_time(arrival_rate) 
  endif 
endprocedure 

procedure departure() 
{ procedure to handle a departure, 
  changes the values of next_arrival, next_departure, and number } 
  if number = maximum then 
    next_arrival ← arrival_time(arrival_rate) 
  else 
    next_arrival ← next_arrival - next_departure 
  endif 
  number ← number - 1 
  if number = 0 then 
    next_departure ← infinity 
  else 
    next_departure ← departure_time(service_rate) 
  endif 
endprocedure 

function arrival_time(arrival_rate) 
{ function to generate a new arrival time } 
  U ← rnd() 
  return (-(log(U))/arrival_rate) 
endfunction 

function departure_time(service_rate) 
{ function to generate a new departure time } 
  U ← rnd() 
  return (-(log(U))/service_rate) 
endfunction 


What is typically found from such a simulation (with Poisson arrivals and an 
exponential service distribution) is that there is good agreement with the predicted results as long 
as $\rho = \lambda/\mu$ is small, but the results become more erratic as $\rho \to 1$. It takes a very long 
time for the distribution to reach its steady-state form when the value of $\rho$ is close to 
unity. In that situation the theoretical steady-state results may be of limited value. 
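
For readers who wish to experiment, the listing of Figure 11.38 can be transcribed into Python along the following lines. This is a sketch under stated assumptions rather than a definitive implementation: the function name `simulate_queue`, the seeding, and the use of `1 - random()` to avoid taking the logarithm of zero are choices made here, and the rates in the usage lines (2 arrivals and 3 services per unit time) are illustrative.

```python
import math
import random

def simulate_queue(arrival_rate, service_rate, limit, maximum, seed=0):
    """Event-driven simulation of a single-channel queue (after Figure 11.38).

    Returns (distribution, mean): the fraction of time spent with i customers
    in the system for i = 0..maximum, and the mean number in the system.
    Assumes maximum >= 1.
    """
    rng = random.Random(seed)

    def exp_time(rate):
        # Inverse-transform sampling; 1 - U lies in (0, 1], so the log is defined.
        return -math.log(1.0 - rng.random()) / rate

    time = 0.0
    number = 0
    system = [0.0] * (maximum + 1)
    next_arrival = exp_time(arrival_rate)
    next_departure = math.inf

    while time <= limit:
        if next_arrival < next_departure:
            # Arrival: credit the elapsed time to the current state.
            step = next_arrival
            time += step
            system[number] += step
            if number == 0:
                next_departure = exp_time(service_rate)
            else:
                next_departure -= step
            number += 1
            next_arrival = math.inf if number == maximum else exp_time(arrival_rate)
        else:
            # Departure: same bookkeeping with the roles exchanged.
            step = next_departure
            time += step
            system[number] += step
            if number == maximum:
                next_arrival = exp_time(arrival_rate)
            else:
                next_arrival -= step
            number -= 1
            next_departure = math.inf if number == 0 else exp_time(service_rate)

    distribution = [t / time for t in system]
    mean = sum(i * p for i, p in enumerate(distribution))
    return distribution, mean

distribution, mean = simulate_queue(arrival_rate=2.0, service_rate=3.0,
                                    limit=100_000.0, maximum=200)
rho = 2.0 / 3.0
print(round(mean, 2), round(rho / (1 - rho), 2))
```

Repeating the run with the arrival rate raised towards the service rate shows the slower convergence described above.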


11.10.6 Exercises 


60 A sea area has on average 15 gales annually, evenly 
distributed throughout the year. Assuming 
that the gales occur independently, find the 
probability that more than two gales will occur 
in any one month. 


61 Suppose that the average number of telephone calls 
arriving at a switchboard is 30 per hour, and that 
they arrive independently. What is the probability 
that no calls will arrive in a three-minute period? 
What is the probability that more than five calls will 
arrive in a five-minute period? 


62 Show that for a single-channel queue with Poisson 
arrivals and exponential service time distribution 
the mean numbers of customers in the system and in 
the queue are 

\[ N_S = \frac{\rho}{1 - \rho}, \qquad N_Q = \frac{\rho^2}{1 - \rho} \]

where $\rho$ is the ratio of arrival and service rates. 
(Hint: Differentiate the equation 

\[ \sum_{n=0}^{\infty} \rho^n = \frac{1}{1 - \rho} \]

with respect to $\rho$.) 


63 Patients arrive at the casualty department of a 
hospital at random, with a mean arrival rate of 
three per hour. The department is served by one 
doctor, who spends on average 15 minutes with 
each patient, actual consulting times being 
exponentially distributed. Find 

(a) the proportion of time that the doctor is idle; 
(b) the mean number of patients waiting to 
see the doctor; 
(c) the probability of there being more than three 
patients waiting; 
(d) the mean waiting time for patients; 
(e) the probability of a patient having to wait 
longer than one hour. 


64 A small company operates a cleaning and 
re-catering service for passenger aircraft at an 
international airport. Aircraft arrive requiring this 
service at a mean rate of $\lambda$ per hour, and arrive 
independently of each other. They are serviced one 
at a time, with an exponential distribution of service 
time. The cost for each aircraft on the ground is put 
at $c_1$ per hour, and the cost of servicing the planes at 
a rate $\mu$ is $c_2\mu$ per hour. Prove that the service rate 
that minimizes the total cost per hour is 

\[ \mu = \lambda + \sqrt{\frac{c_1 \lambda}{c_2}} \]


65 The machines in a factory break down in a Poisson 
pattern at an average rate of three per hour during 
the eight-hour working day. The company has two 
service options, each involving an exponential 
service time distribution. Option A would cost 
£20 per hour, and the mean repair time would 
be 15 min. Option B would cost £40 per hour, with 
a mean repair time of 12 min. If machine idle time 
is costed at £60 per hour, which option should be 
adopted? 


66 Ships arrive independently at a port at a mean 
rate of one every three hours. The time a ship 
occupies a berth for unloading and loading has an 
exponential distribution with a mean of 12 hours. 
If the mean delay to ships waiting for berths is to be 
kept below six hours, how many berths should there 
be at the port? 


67 In a self-service store the arrival process is Poisson, 
with on average one customer arriving every 30 s. 
A single cashier can serve customers every 48 s on 
average, with an exponential distribution of service 
time. The store managers wish to minimize the 
mean waiting time for customers. To do this, they 
can either double the service rate by providing an 
additional server to pack the customer's goods 
(at a single cash desk) or else provide a second cash 
desk. Which option is preferable? 

11.11 Bayes’ theorem and its applications 


To end this chapter, we return to the foundations of probability and inference. The 
definition of conditional probability is fundamental to the subject, and from it there 
follows the theorem of Bayes, which has far-reaching implications. 


11.11.1 Derivation and simple examples 


The definition in Section 11.2.1 of the conditional probability of an event $B$ given that 
another event $A$ occurs can be rewritten as 

\[ P(A \cap B) = P(B \mid A)P(A) \]

If $A$ and $B$ are interchanged then this becomes 

\[ P(A \cap B) = P(A \mid B)P(B) \]

The left-hand sides are equal, so we can equate the right-hand sides and rearrange, 
giving 

\[ P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B)} \]

Now suppose that $B$ is known to have occurred, and that this can only happen if one of 
the mutually exclusive events 

\[ \{A_1, \ldots, A_n\}, \qquad A_i \cap A_j = \emptyset \quad (i \ne j) \]

has also occurred, but which one is not known. The relevance of the various events $A_i$ 
to the occurrence of $B$ is expressed by the conditional probabilities $P(B \mid A_i)$. Suppose 
that the probabilities $P(A_i)$ are also known. The examples below will show that this is 
a common situation, and we should like to work out the conditional probabilities 

\[ P(A_i \mid B) = \frac{P(B \mid A_i)P(A_i)}{P(B)} \]

To find the denominator, we sum from 1 to $n$: 

\[ \sum_{i=1}^{n} P(A_i \mid B) = 1 = \frac{1}{P(B)} \sum_{i=1}^{n} P(B \mid A_i)P(A_i) \]

The sum is equal to 1 by virtue of the assumption that $B$ could not have occurred without 
one of the $A_i$ occurring. We therefore obtain a formula for $P(B)$: 

\[ P(B) = \sum_{i=1}^{n} P(B \mid A_i)P(A_i) \]

which is sometimes called the rule of total probability. Hence we have the following 
theorem. 


Theorem 11.2 Bayes’ theorem 

If $\{A_1, \ldots, A_n\}$ are mutually exclusive events, one of which must occur given that 
another event $B$ occurs, then 

\[ P(A_i \mid B) = \frac{P(B \mid A_i)P(A_i)}{\sum_{j=1}^{n} P(B \mid A_j)P(A_j)} \qquad (i = 1, \ldots, n) \]

end of theorem 


Example 11.36 

Three machines produce similar car parts. Machine A produces 40% of the total output, 
and machines B and C produce 25% and 35% respectively. The proportions of the 
output from each machine that do not conform to the specification are 10% for A, 5% 
for B and 1% for C. What proportion of those parts that do not conform to the 
specification are produced by machine A? 


Solution 

Let $D$ represent the event that a particular part is defective. Then, by the rule of total 
probability, the overall proportion of defective parts is 

\[ P(D) = P(D \mid A)P(A) + P(D \mid B)P(B) + P(D \mid C)P(C) = (0.1)(0.4) + (0.05)(0.25) + (0.01)(0.35) = 0.056 \]

Using Bayes’ theorem, 

\[ P(A \mid D) = \frac{P(D \mid A)P(A)}{P(D)} = \frac{(0.1)(0.4)}{0.056} = 0.714 \]

so that machine A produces 71.4% of the defective parts. 
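
The calculation in Example 11.36 generalizes to any number of mutually exclusive events. The short Python sketch below is an illustration; the function name `posterior` and its return convention are assumptions made here.

```python
def posterior(priors, likelihoods):
    """Return ([P(A_i | B)], P(B)) from the lists [P(A_i)] and [P(B | A_i)]."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # rule of total probability: P(B)
    return [j / total for j in joint], total

# Machines A, B, C: shares of output and proportions of non-conforming parts.
post, p_defective = posterior([0.40, 0.25, 0.35], [0.10, 0.05, 0.01])
print(round(p_defective, 3), [round(x, 3) for x in post])
```

The first entry of the returned list is the proportion of defective parts attributable to machine A.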
Example 11.37 

Suppose that 0.1% of the people in a certain area have a disease $D$ and that a mass 
screening test is used to detect cases. The test gives either a positive or a negative result 
for each person. Ideally, the test would always give a positive result for a person who 
has $D$, and would never do so for a person who has not. In practice the test gives a 
positive result with probability 99.9% for a person who has $D$, and with probability 
0.2% for a person who has not. What is the probability that a person for whom the test 
is positive actually has the disease? 


Solution 

Let $T$ represent the event that the test gives a positive result. Then the proportion of 
positives is 

\[ P(T) = P(T \mid D)P(D) + P(T \mid \bar{D})P(\bar{D}) = (0.999)(0.001) + (0.002)(0.999) \approx 0.003 \]

and the desired result is 

\[ P(D \mid T) = \frac{P(T \mid D)P(D)}{P(T)} = \frac{(0.999)(0.001)}{0.003} = \frac{1}{3} \]

Despite the high basic reliability of the test, only one-third of those people receiving a 
positive result actually have the disease. This is because of the low incidence of the 
disease in the population, which means that a positive result is twice as likely to be a 
false alarm as it is to be correct. 


In connection with Example 11.37, it might be wondered why the reliability of the 
test was quoted in the problem in terms of 

P(positive result | disease) and P(positive result | no disease) 

instead of the seemingly more useful 

P(disease | positive result) and P(disease | negative result) 

The reason is that the latter figures are contaminated, in a sense, by the incidence of the 
disease in the population. The figures quoted for reliability are intrinsic to the test, and 
may be used anywhere the disease occurs, regardless of the level of incidence. 
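
The dependence of P(disease | positive result) on the level of incidence can be seen by repeating the calculation of Example 11.37 for several incidence levels. In the Python sketch below the function and parameter names are assumptions, and the incidence levels other than 0.1% are hypothetical values chosen for illustration.

```python
def prob_disease_given_positive(incidence,
                                p_pos_given_disease=0.999,
                                p_pos_given_no_disease=0.002):
    """P(disease | positive result) by Bayes' theorem."""
    true_positive = p_pos_given_disease * incidence
    false_positive = p_pos_given_no_disease * (1.0 - incidence)
    return true_positive / (true_positive + false_positive)

# The test characteristics are held fixed; only the incidence changes.
for incidence in (0.001, 0.01, 0.1):
    print(incidence, round(prob_disease_given_positive(incidence), 3))
```

With the same test, the probability rises from one-third at an incidence of 0.1% as the disease becomes more common.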


Applications in probabilistic inference 


The scope for applications of Bayes' theorem can be widened considerably if we assume 
that the calculus of probability can be applied not just to events as subsets of a sample 
space but also to more general statements about the world. Events are essentially state- 
ments about facts that may be true on some occasions and false on others. Scientific 
theories and hypotheses are much deeper statements, which have great explanatory 
and predictive power, and which are not so much true or false as gaining or lacking in 
evidence. One way to assess the extent to which some evidence E supports a hypothesis 
H is in terms of the conditional probability P(H | E). The relative frequency interpretation 
of probability does not normally apply in this situation, so a subjective interpretation is 
adopted. The quantity P(H | E) is regarded as a degree of belief in hypothesis H on the 
basis of evidence E. In an attempt to render the theory as objective as possible, the rules 
of probability are strictly applied, and an inference mechanism based on Bayes’ theorem 
is employed. 

Suppose that there are in fact two competing hypotheses H₁ and H₂. Let X represent 
all background information and evidence relevant to the two hypotheses. The probabilities 
P(H₁ | X) and P(H₁ | X ∩ E) are called the prior and posterior probabilities of H₁, 
where E is a new piece of evidence. Similarly, there are prior and posterior probabilities 
of H₂. Applying Bayes’ theorem to both H₁ and H₂ and cancelling the common denominator 
P(E) gives 

\[
\frac{P(H_1 \mid X \cap E)}{P(H_2 \mid X \cap E)} = \frac{P(E \mid H_1 \cap X)}{P(E \mid H_2 \cap X)} \cdot \frac{P(H_1 \mid X)}{P(H_2 \mid X)}
\]


The left-hand side and the second factor on the right-hand side are called the posterior 
odds and prior odds respectively, favouring H₁ over H₂. The first factor on the right-hand 
side is called the likelihood ratio, and it measures how much more likely it is that 
the evidence event E would occur if the hypothesis H₁ were true than if H₂ were true. 
The new evidence E therefore ‘updates’ the odds, and the process can be repeated as 
often as desired, provided that the likelihood ratios can be calculated. 
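
The cancellation that produces the odds form can be written out explicitly. The sketch below assumes that every probability is conditioned on the background information X, so that the common denominator (written P(E) in the text) appears as P(E | X).

```latex
% Bayes' theorem for each hypothesis, conditioned throughout on X
P(H_i \mid X \cap E) = \frac{P(E \mid H_i \cap X)\,P(H_i \mid X)}{P(E \mid X)}, \qquad i = 1, 2

% Dividing the i = 1 case by the i = 2 case cancels P(E | X)
\frac{P(H_1 \mid X \cap E)}{P(H_2 \mid X \cap E)}
  = \frac{P(E \mid H_1 \cap X)}{P(E \mid H_2 \cap X)} \cdot \frac{P(H_1 \mid X)}{P(H_2 \mid X)}
```

If several pieces of evidence are conditionally independent given each hypothesis, the same step can be applied once per piece of evidence, so the likelihood ratios multiply.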


Example 11.38 

From experience it is known that when a particular type of single-board microcomputer 
fails, this is twice as likely to be caused by a short on the serial interface (H₁) as by a 
faulty memory circuit (H₂). The standard diagnostic test is to measure the voltage at a 
certain point on the board, and from experience it is also known that a drop in voltage 
there occurs nine times out of ten when the memory circuit is faulty but only once in 
six occasions of an interface short. How does the observed drop in voltage (E) affect 
the assessment of the cause of failure? 


Solution 

The prior odds are two to one in favour of H₁, and the likelihood ratio is (1/6)/(9/10), 
so the posterior odds are given by 

\[
\frac{P(H_1 \mid E)}{P(H_2 \mid E)} = \frac{1/6}{9/10}\,(2) = 0.370
\]

The evidence turns the odds around to about 2.7 to one in favour of H₂. 
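
The odds update in this example is a single multiplication, which can be sketched in Python as follows. The function name `update_odds` is hypothetical, and exact fractions are used only so that the result can be read off without rounding.

```python
from fractions import Fraction


def update_odds(prior_odds, p_evidence_given_h1, p_evidence_given_h2):
    """Return posterior odds (H1 over H2) = likelihood ratio x prior odds."""
    likelihood_ratio = p_evidence_given_h1 / p_evidence_given_h2
    return likelihood_ratio * prior_odds


# Example 11.38: prior odds 2:1 for the interface short (H1);
# a voltage drop occurs 1 time in 6 under H1 and 9 times in 10 under H2.
posterior = update_odds(Fraction(2), Fraction(1, 6), Fraction(9, 10))
print(posterior, float(posterior))

# Odds of H2 over H1 are the reciprocal, about 2.7 to one
print(float(1 / posterior))
```

The posterior odds come out as 10/27, so the corresponding probability of H₁ is 10/37, a little over one in four.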


Example 11.39 

An oil company is prospecting for oil in a certain area, and is conducting a series of 
seismic experiments. It is known from past experience that if oil is present in the rock 
strata below then there is on average one chance in three that a characteristic pattern 
will appear on the trace recorded by the seismic detector after a test. If oil is absent 
then the pattern can still appear, but is less likely, appearing only once in four tests on 
average. After 150 tests in the area the pattern has been seen on 48 occasions. Assuming 
prior odds of 3:1 against the presence of oil, find the updated odds. Also find the 90% 
confidence interval for the true probability of the pattern appearing after a test, and 
hence consider whether oil is present or not. 
Solution 

Let H₁ and H₂ represent the hypotheses that oil is present and that it is absent respectively. 
There were effectively 150 pieces of evidence gathered, and the odds need to be 
multiplied by the likelihood ratio for each. Each time the pattern is present the likelihood 
ratio is 

\[
\frac{P(\text{pattern} \mid H_1)}{P(\text{pattern} \mid H_2)} = \frac{1/3}{1/4} = \frac{4}{3}
\]

and each time it is absent the likelihood ratio is 

\[
\frac{P(\text{no pattern} \mid H_1)}{P(\text{no pattern} \mid H_2)} = \frac{2/3}{3/4} = \frac{8}{9}
\]

The updated odds, letting E represent the total evidence, become 

\[
\frac{P(H_1 \mid E)}{P(H_2 \mid E)} = \left(\frac{4}{3}\right)^{48} \left(\frac{8}{9}\right)^{102} \left(\frac{1}{3}\right) = 2.01
\]

The odds that there is oil present are therefore raised to 2:1 in favour. 
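
Because the update multiplies 150 likelihood ratios, it is convenient to accumulate it in logarithms. The following Python sketch does this for the figures of Example 11.39; the function name and the list-of-pairs argument are assumptions made for illustration.

```python
import math


def posterior_odds(prior_odds, ratios_with_counts):
    """Multiply prior odds by each likelihood ratio raised to its count.

    Sums logarithms so that large powers neither overflow nor underflow.
    """
    log_odds = math.log(prior_odds)
    for ratio, count in ratios_with_counts:
        log_odds += count * math.log(ratio)
    return math.exp(log_odds)


lr_pattern = (1 / 3) / (1 / 4)     # pattern seen: 4/3
lr_no_pattern = (2 / 3) / (3 / 4)  # pattern not seen: 8/9

# Prior odds 1:3 (that is, 3:1 against oil); 48 patterns in 150 tests
odds = posterior_odds(1 / 3, [(lr_pattern, 48), (lr_no_pattern, 150 - 48)])
print(round(odds, 2))
```

Each observed pattern raises the log-odds by ln(4/3) and each absence lowers them by ln(9/8), so the result depends only on the two counts and not on the order of the tests.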

Confidence intervals for proportions were covered in Section 11.3.6. The proportion 
of tests for which the pattern was observed is 48/150 or 0.32, so the 90% confidence 
interval for the probability of appearance is 

\[
0.32 \pm 1.645 \sqrt{\frac{(0.32)(0.68)}{150}} = (0.26, 0.38)
\]

The hypothesis that oil is absent is not compatible with this, because the pattern should 
then appear with probability 0.25, whereas the hypothesis that oil is present is fully 
compatible. 
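
The interval can be reproduced with a short Python sketch. It assumes the usual normal approximation for a sample proportion, which is taken here to be the method of Section 11.3.6; the function name is illustrative.

```python
import math
from statistics import NormalDist


def proportion_interval(successes, trials, confidence):
    """Normal-approximation confidence interval for a binomial proportion."""
    p_hat = successes / trials
    # Two-sided critical value, e.g. about 1.645 for 90% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / trials)
    return p_hat - half_width, p_hat + half_width


low, high = proportion_interval(48, 150, 0.90)
print(round(low, 2), round(high, 2))

# Compare with the pattern probabilities under each hypothesis
print("oil absent (0.25) inside interval:", low <= 0.25 <= high)
print("oil present (1/3) inside interval:", low <= 1 / 3 <= high)
```

The value 0.25 falls just below the lower limit while 1/3 lies inside the interval, which is the comparison made in the solution.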


For the problem in Example 11.38 it is conceivable that there could be enough 
repetitions for the relative frequency interpretation to be placed on the probabilities of 
the two hypotheses. In contrast, in Example 11.39 the probability of the presence or 
absence of oil is not well suited to a frequency interpretation, but the subjective inter- 
pretation is available. 

Example 11.39 also provides a contrast between the ‘Bayesian’ and ‘classical’ inference 
approaches. The classical confidence interval appears to lead to a definite result: H₁ 
is true and H₂ is false. This definiteness is misleading, because it is possible (although 
not likely) that the opposite is the case, but the evidence supports one hypothesis more 
than the other. The Bayesian approach has the merit of indicating this relative support 
quantitatively. 

One area where Bayesian inference is very important is in decision support and 
expert systems. In classical decision theory Bayesian inference is used to update the 
probabilities of various possible outcomes of a decision, as further information becomes 
available. This allows an entire programme of decisions and their consequences to be 
planned (see D. V. Lindley, Making Decisions, 2nd edn. Wiley, London, 1985). Expert 
systems often involve a process of reasoning from evidence to hypothesis with a Bayesian 
treatment of uncertainty (see for example R. Forsyth, ed., Expert Systems, Principles 
and Case Studies. Chapman & Hall, London, 1984). 
11.11.3 Exercises 


68 A telephone-based automated customer care system 
has three main menu options: 45% of customers 
choose option 1, 32% choose option 2, and 23% 
choose option 3. Of those who choose option 1, 
28% eventually get routed to a service agent, as do 
41% of those who choose option 2 and 16% of 
those who choose option 3. What is the overall 
proportion of customers who eventually get 
routed to a service agent? 


69 An explosion at a construction site could have 
occurred as a result of (a) static electricity, 
(b) malfunctioning of equipment, (c) carelessness 
or (d) sabotage. It is estimated that such an explosion 
would occur with probability 0.25 as a result of (a), 
0.20 as a result of (b), 0.40 as a result of (c) and 0.75 
as a result of (d). It is also judged that the prior 
probabilities of the four causes of the explosion 
are (a) 0.20, (b) 0.40, (c) 0.25, (d) 0.15. Find the 
posterior probabilities and hence the most likely 
cause of the explosion. 


70 Three marksmen (A, B and C) fire at a target. Their 
success rates at hitting the target are 60% for A, 
50% for B and 40% for C. If each marksman fires 
one shot at the target and two bullets hit it, then 
which is more probable: that C hit the target, or did 
not? 


71 An accident has occurred on a busy highway 
between city A, of 100 000 people, and city B, of 
200 000 people. It is known only that the victim 
is from one of the two cities and that his name is 
Smith. A check of the records reveals that 10% of 
city A’s population is named Smith and 5% of city 
B’s population has that name. The police want to 
know where to start looking for relatives of the 
victim. What is the probability that the victim is 
from city A? 

72 In a certain community, 8% of all adults over 50 
have diabetes. If a health service in this community 
correctly diagnoses 95% of all persons with 
diabetes as having the disease, and incorrectly 
diagnoses 2% of all persons without diabetes as 
having the disease, find the probabilities that 

(a) the community health service will diagnose an 
adult over 50 as having diabetes, 
(b) a person over 50 diagnosed by the health service 
as having diabetes actually has the disease. 


73 A stockbroker correctly identifies a stock as being a 
good one 60% of the time and correctly identifies a 
stock as being a bad one 80% of the time. A stock 
has a 50% chance of being good. Find the 
probability that a stock is good if 

(a) the stockbroker identifies it as good, 
(b) k out of n stockbrokers of equal ability 
independently identify it as good. 


74 On a communications channel, one of three 
sequences of letters can be transmitted: AAAA, 
BBBB and CCCC, where the prior probabilities 
of the sequences are 0.3, 0.4 and 0.3 respectively. 
It is known from the noise in the channel that the 
probability of correct reception of a transmitted 
letter is 0.6, and the probability of incorrect 
reception of the other two letters is 0.2 for 
each. It is assumed that the letters are distorted 
independently of each other. Find the most 
probable transmitted sequence if ABCA is 
received. 


75 The number of accidents per day occurring at 
a road junction was recorded over a period of 
100 days. There were no accidents on 84 days, 
one accident on 12 days, and two accidents on 
four days. One hypothesis is that the number of 
accidents per day has a Poisson distribution with 
parameter λ (unspecified), and another is that the 
distribution is binomial with parameters n = 3 
and p (unspecified). Use the average number of 
accidents per day to identify the unspecified 
parameters and compare the hypotheses assuming 
that the binomial is initially thought to be twice as 
likely as the Poisson. 


76 The following multinomial distribution is a 
generalization of the binomial distribution. Suppose 
that there are k distinct possible outcomes of an 
experiment, with probabilities p₁, ..., p_k, and that 
the experiment is repeated n times. The probability 
of obtaining a number n₁ of occurrences of the first 
possible outcome, n₂ of the second, and so on up to 
n_k of the kth is 

\[
P(n_1, \ldots, n_k) = \frac{n!}{n_1! \cdots n_k!}\,(p_1)^{n_1} \cdots (p_k)^{n_k}
\]

Suppose now that there are two competing 
hypotheses H₁ and H₂. H₁ asserts that the 
probabilities are p₁, ..., p_k as above, and H₂ 
asserts that they are q₁, ..., q_k. Prove that the 
logarithm of the likelihood ratio is 

\[
\ln\left[\frac{P(n_1, \ldots, n_k \mid H_1)}{P(n_1, \ldots, n_k \mid H_2)}\right] = \sum_{i=1}^{k} n_i \ln\left(\frac{p_i}{q_i}\right)
\]


77 According to the design specification, of the 
components produced by a machine, 92% should 
have no defect, 5% should have defect A alone, 
2% should have defect B alone and 1% should 
have both defects. Call this hypothesis H₁. The 
user suspects that the machine is producing more 
components (say a proportion p_B) with defect B 
alone, and also more components (say a proportion 
p_AB) with both defects, but is satisfied that 5% have 
defect A alone. Call this hypothesis H₂. Of a sample 
of 1000 components, 912 had no defects, 45 had 
A alone, 27 had B alone and 16 had both. Using 
the multinomial distribution (as in Exercise 76), 
maximize ln P(912, 45, 27, 16 | H₂) with respect 
to p_B and p_AB, and find the posterior odds assuming 
prior odds of 5:1 in favour of H₁. 


78 It is suggested that higher-priced cars are assembled 
with greater care than lower-priced cars. To 
investigate this, a large luxury model A and a 
compact hatchback B were compared for defects 
when they arrived at the dealer's showroom. All 
cars were manufactured by the same company. The 
numbers of defects for several of each model were 
recorded: 

A: {5, 4, 3, 5, 3, 4} 
B: {8, 6, 8, 9, 5} 

The number of defects in each car can be 
assumed to be governed by a Poisson distribution 
with parameter λ. Compare the hypothesis H₁ that 
λ_A ≠ λ_B with H₂ that λ_A = λ_B = λ, using the average 
numbers of defects to identify the λ values and 
assuming no initial preference between the 
hypotheses. 


11.12 Review exercises (1-10) 


1 Eight cases each of 12 bottles of wine from a 
vineyard were tested for evidence of oxidation 
in the wine. Five of the cases were bottled using 
standard corks and, of these, six bottles were 
found to have oxidized. The remaining cases 
were bottled using plastic bungs and, of these, 
three bottles were found to have oxidized. Test 
the hypothesis that there is no difference in the 
proportion of bottles oxidized for the different 
types of cork. 


2 The amplitude d of vibration of a damped pendulum 
is expected to diminish by 

\[
d = d_0 e^{-\lambda t}
\]


Successive amplitudes are measured from a trace as 
follows: 


1.00 2.04 3.12 4.09 5.22 6.30 7.35 8.39 9.44 10.50 
du 2746891 75181:26220:94:220:008907791980:5290749 20:9 ]19 99:0 21] 





Find a 95% confidence interval for the damping 
coefficient λ. 


3 Successive masses of 1 kg were hung from a 
wire, and the position of a mark at its lower 
end was measured as follows: 





Load/kg        0     1     2     3     4     5     6     7 
Position/cm    6.12  6.20  6.26  6.32  6.37  6.44  6.50  6.57 


It is expected that the extension Y is related to the 
force X by 


Y=LX/EA 


where L = 101.4 cm is the length, 

A= 1.62 x 10? cm? is the area and E is the 
Young's modulus of the material. Find a 95% 
confidence interval for the Young's modulus. 


4 The table in Figure 11.39 gives the intervals, in 
hours, between arrivals of cargo ships at a port 
during a period of six weeks. It is helpful to the 
port authorities to know whether the times of 
arrival are random or whether they show any 
regularity. Fit an exponential distribution to the 
data and test for goodness-of-fit. 


6.8 
21-3 
3.8 
12 
22 
1.8 
4.9 


2I 1.0 28.1 5.8 197 29 
9.1 6.9 5.6 2.0 22 10.2 
14.7 3.8 13:9 20 4.1 22] 
0.7 [Eo 1.8 0.7 0.4 72.0 
17.6 3.6 24 9. 0.4 4.4 
22.4 11.6 4.2 18.0 3.0 162 
6.8 10.7 0.9 24 3.8 9.0 


Figure 11.39 Time interval data for Review exercise 4. 


5 


When large amounts of data are processed, there 
is a danger of transcription errors occurring (for 
example, a decimal point in the wrong place), which 
could bias the results. One way to avoid this is to 
test for outliers in the data. Suppose that X₁, ..., 
X_n are independent exponential random variables, 
each with a common parameter λ. Let the random 


16.3 
6.5 
5.8 

10.7 

ell 
6.8 
8.8 


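variable Y be the largest of these divided by the sum: 

\[
Y = \frac{\max_i X_i}{\sum_{i=1}^{n} X_i}
\]

It can be shown (V. Barnett and T. Lewis, Outliers 
in Statistical Data. Wiley, Chichester, 1978) that 
the distribution function of Y is given by 

\[
F_Y(y) = \sum_{k=0}^{[1/y]} (-1)^k \binom{n}{k} (1 - ky)^{n-1} \qquad \left(\frac{1}{n} \le y \le 1\right)
\]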


where [1/y] denotes the integer part of 1/y. For the 
data in the Review exercise 4 (Figure 11.39) test 
the largest value to see whether it is reasonable 

to expect such a value if the data truly have an 
exponential distribution. Find 95% confidence 
intervals for the mean inter-arrival time with this 
value respectively included and excluded from 

the data. 


6 Language courses in French, German and Spanish 
are offered by an adult learning institute. At the end 
of each course, the students are asked to grade 
their response to the course as either very satisfied, 
fairly satisfied, neutral, fairly dissatisfied, or very 
dissatisfied. After gathering data for several terms 
the results are as follows: 





Grade French German Spanish 
Very satisfied 16 6 22 
Fairly satisfied 63 13 76 
Neutral 40 27 60 
Fairly dissatisfied 10 13 32 
Very dissatisfied В S 12 
10.7 253 12,5 1.6 3.0 90) 15.9 
6.8 42.5 29 WO 3.1 249 1.0 
7.6 6.4 11.3 51.6 15.6 2.6 7.6 
8.3 IS 3.6 6.0 0.1 ЗЫ 129 
ПА 10.1 18.8 3.4 0.2 4.9 120 
3:7 13.6 1537 0.7 Du 18.8 29.8 
4.8 0.3 4.6 4.9 6.1 33.0 6.5 


Is there evidence of different levels of satisfaction 
with the different courses? 


7 A surgeon has to decide whether or not to 
perform an operation on a patient suspected of 
suffering from a rare disease. If the patient has 
the disease, he has a 50:50 chance of recovering 
after the operation but only a one in 20 chance of 
survival if the operation is not performed. On the 
other hand, there is a one in five chance that a 
patient who has not got the disease would die as 
a result of the operation. How will the decision 
depend upon the surgeon's assessment of the 
probability p that the patient has the disease? 
(Hint: Use 

\[
P(B \mid A) = P(B \mid A \cap C)P(C) + P(B \mid A \cap \bar{C})P(\bar{C})
\]

where A and C are independent.) 


8 A factory contains 200 machines, each of which 
becomes misaligned on average every 200 h of 
operation, the misalignments occurring at random 
and independently of each other and of other 
machines. To detect the misalignments, a quality 
control chart will be followed for each machine, 
based on one sample of output per machine per 
hour. Two options have been worked out: option 
A would cost £1 per hour per machine, whereas 
option B would cost £1.50 per hour per machine. 
The control charts differ in their average run 
lengths (ARLs) to a signal of action required. 
Option A (Shewhart) has an ARL of 20 for a 
misaligned machine, but will also generate false 
alarms with an ARL of 1000 for a well-adjusted 
machine. Option B (cusum) has an ARL of four 
for a misaligned machine and an ARL of 750 for 
a well-adjusted machine. 

When a control chart signals action required, 
the machine will be shut down and will join a 
queue of machines awaiting servicing. A single 
server will operate, with a mean service time 
of 30 min and standard deviation of 15 min, 
regardless of whether the machine was actually 
misaligned. This is all that is known of the 
service time distribution, but use can be made 
of the Pollaczek-Khintchine formula, which 
applies to single-channel queues with arbitrary 
service distributions: 

\[
N_S = \rho + \frac{(\lambda\sigma_S)^2 + \rho^2}{2(1 - \rho)}
\]

(the notation is as in Section 11.10.3, with σ_S the 
standard deviation of service time). 
During the time that a machine is in the 
queue and being serviced, its lost production is 
costed at £200 per hour. In addition, if the 
machine is found to have been misaligned then 
its output for the previous several hours (given 
on average by the ARL) must be examined 
and if necessary rectified, at a cost of £10 per 
production hour. 
Find the total cost per hour for each option, 
and hence decide which control scheme should be 
implemented. 


9 A transmission channel for binary data connects 
a source to a receiver. The source emits a 0 with 
probability α and a 1 with probability 1 − α, each 
symbol independent of every other. The noise in 
the channel causes some bits to be interpreted 
incorrectly. The probability that a bit will be 
inverted is p (whether a 0 or a 1, the channel is 
‘symmetric’). 

(a) Using Bayes’ theorem, express the four 
probabilities that the source symbol is a 0 or a 
1 given that the received symbol is a 0 or a 1. 

(b) If p is small and the receiver chooses to 
deliver whichever source symbol is the more 
likely given the received symbol, find the 
conditions on α such that the source symbol 
is assumed to be the same as the received 
symbol. 


10 If discrete random variables X and Y can take 
possible values (u₁, ..., u_m) and (v₁, ..., v_n) 
respectively, with joint distribution P(u_i, v_j) 
(see Section 11.4.1), the mutual information 
between X and Y is defined as 

\[
I(X; Y) = \sum_{i=1}^{m} \sum_{j=1}^{n} P(u_i, v_j) \log_2 \frac{P(u_i, v_j)}{P(X = u_i)P(Y = v_j)}
\]

Show that for the binary symmetric transmission 
channel referred to in Review exercise 9, if X is 
the source symbol, Y the received symbol and 
α = 1/2 then 

\[
I(X; Y) = 1 + p \log_2 p + (1 - p) \log_2 (1 - p)
\]

The interpretation of this quantity is that it 
measures (in ‘bits’) the average amount of 
information received for each bit of data 
transmitted. Show that I(X; Y) = 0 when p = 1/2 
and that I(X; Y) → 1 as p → 0 and as p → 1. 
Interpret this result. 
Exercises 





CHAPTER 1 12 -6,3,2;[2 1 1],[-1 1 17,f0 1 -17 
13 [i -1 of 
14 8.59, [0.61 0.71 0.35]" 


15 (а) 3.62, [0.62 1 1] 

ү yk 0 (b) 7, [0.25 0.5 1f 

_ = (с) 2.62,[1 -0.62 -0.62 1] 
2 A= J; -J4 0 
© 9 I Ars» esp -1 0f 

The transformation rotates the e,, e, plane through 1/4 

about the e, axis. 
1 (a) 


18 10.132, 4.491, 0.373 


3 (a), (c) and (d) 19 (b) 0.59 

4 The set of all odd quintic polynomials; it has dimension 3. 20 5,2,-1;[-1 5 3],[0 2 1]. [1 0 oy 
(е) 14,7,-7;[2 6 3[,[6 3 5B 2 -6f 
(f 2L-L[ 1 5p 0 1p 2 77 
(в) 5,3,1; 12 3 -Ñg -1 of[o -1 1f 25[0 0 1f 
(h) 4,3,1;2 —1 AR -1 ор, 1 -2 


ом м 
© t м 


0 
0 
3 
7 (а) 5,[1 1 1]5 1 (repeated) with two linearly 
independent eigenvectors, e.g. [0 1 2], 
flo -1f 
(b) -1,[8 1 3]; 2 (repeated) with one linearly 
independent eigenvector, e.g. [1 —1  0]T 
(c) 1, [4 1 —3]5 2 (repeated) with one linearly 
independent engenvector, e.g. [3 1 -2]' 27 А=-2:[0 11 
(4) 2,[2_ 1 2]51 (repeated) with two linearly А=&[0 1-1 
independent eigenvectors, e.g. [0 2 —1]', 2 0 0 
[2 0 3f 
-2 0 
4 
0 


— Qc 


1 1 
26)=|0 1 
0 0 


— 


op o o ip 
017,16 -1 0 -6f 


0 


81,[-3 1 17 0 0 


0 
1 
92eg.f1 0 1],[0 1 1f 0 0 4 
28 y2+y2+y3 




29 (a) Positive-definite 


(b) Positive-semidefinite 


30 (a) 2a> 1, 


(c) Indefinite 
(b) 27? < 6a -3 


31 Positive-semidefinite, eigenvalues 3, 3, 0 


32 k > 2; when k = 2 Q is positive-semidefinite 


33 а> 2 
34 д2 5 


36 (а) ; ; 
2 3 


ol a ð k 
5 7 12 17 


=2 551 


1| -I 1 
37 (a) i ; | -3 5 


47231 4734 


7 -1 -2 


2 47270 


38 | 47342 47195 47306 
47270 47306 47267 


39 (a) n " 
te e 


41 (а) Е 2 


4 2 
42 |1 +27 +1-4 
52+ 5 





-1 2t-1 


(b) б t | 
e -e e 


10 

w 90 
| (b | 

1 

2 


Р-Р+1-1 
51-5 


43 (a) 3,3 (b) Yes 





122 
3 1 
44 (а) 179 116/0 о Е E: 
1 | з з 3 
m ġo 310 0/5 1] 
3 3 3 
ure om 
b) — 
(D od 8 
10 -10 
p d$ 30 -17 6 -4 
14118 9 9 30 27 
46 (a) 1 
: 0 $0 i 
2 ү? 
О |5 50 о t 
2 %2 
3 40 0 0 





ЕГ -2 2 
© uc 2 2 


(9) х=, у= -1 
47 (а) х=у= 2 
| 4 5 1 6 
48 (b) 15 2 10 8 3 
-3 0 3 3 


49 (а) (1) х=у=1 (ii) x =y = 1.0909 
(b) (0 х=у=1 (i) x=y= 1.4785 
(с) () х=у=1 (Ш) х= у= 1.4998 
50 т = 0.5, с= 0.8 


0 1 0 0 


51 (a) Х= | 0 0 1|х+|0|и, 
-4 -5 -4 1 
у=[1 0 0]х 
0 1 00 0 
(b х= AB E. x+ ш и, 
0 0 0 | 0 
0 -4 -2 0 $ 





52 (а) *= | 0 0 1|х+|0|и, 
=7 -5 -6 1 
y=[5 3 1]х 
0 1 0 0 
(0) х= 0 0 1|х+)ои, у=[2 3 1х 
0 -3 -4 1 
E -R;/L; 1/1, UL, 
53 X- |-R/L, -(R,*RjJL, -ML,|X * | l/L, t. 
1/С 1/С 1/С 0 
y=[0 R, O]x 
54 A possible model is 
B(M, + М)/ММ, 1 0 0 
g2| (GM*KM,*KM)MM, 0 1 0|. 
-K,B/MM, 0 0 1 
| -K,K/MM, 0 0 0 
0 
J и, у=[1 0 0 0]х 
K,B/MM, 


K,KJ MM, 


(Ri + Ry + Ry) К, 





55 Xi _ ac, ac, X; 
X, R _К, + К; X2 
ac, ac, 
Ry +R, 
ac, 
u 
Rs 
ac, 
Ri _RitR; 
M B a a 
| | Кир урук) EB 
a a 
R; 
+ s и, 
DRR) 
a 


о = КЁ; + (Ку + К;)(К, + К) 
-2.6 x 107, -1.1 x 10? 


t 
56 e 0 
te! e 


=f = 
n ге | yce" 2 


t 


ZI e (1-2) 
58 [e e(r D] 
59 x =2-— 4e” +3”, x, =8e” -— 9e” 
60 x,—4te? c e?'—e*, x, 23e'-2e?' - Ate 
-5 +e" + Le" 
61 x(t) = ? : 
gt , 5 51 


3-ie vie 


62x(-e[3 2]-e?[1 -2]' 


14e°-4 д 


= -6 
7e” +80” 


63 el 


64 x(t) = e”™{ (cos 2t — sin 20[2  1]T 
— (cos 2t + sin 2)[0 1]"} 


20 0 : 
652-0 | 0|+{| oļu, 
00-1 -i 


3 
y-[l -4 2k 


1 =A 
T> %53 


66 œ% = 4, 4 = 
676,1, [4 Wo -1]" 
x, =4e% —3e', x, =e" + 3e! 


68 Same as Exercise 61 
69 Asymptotically stable 
70 Asymptotically stable 
71 a>0,b>0 


1.10 Review exercises 


1 (a) 5,2,-1;[11 5 3]5[s 2 iJ5[ o op 
(03,2,45[1 2 1D i on o 1 
(о 3,1, 06 [1 2 50 o pp i 1f 


2655 1 m i efu or 


3 b=1,c=2;A=2, 4, 1; 
По -2 -1[,п 1 -1f 


4 54 


5 (а) 4.56; [0.72 0.84 117 (Ы) 1.75 
(c) (i) L19 (ii) 1.75 


6[0 -1 1f 
73,2, [2 1 15B 2 mss or 
IEE TE EE, 


$5 - 3 -15 {0 1 -2 15, 
[D o0 1 -F [0 0 0 15; 
C- Ce + ЗСе“ – ЗСе?' 


ü k 
10 (a) (i) | F | (ii) | : | 
-32 3 2*-1 1 


FA) 


1 (l-e 
-2t 


0 e 


(b) 


He-[ 0 Oer-[ 1 0],еї=[0 0 iT 


12 2, 2 € 42; 1:0:-1, ::-42:1, 1:42:1 


1 
2 


13 (a) Positive-semidefinite (b) Positive-definite 
(c) Indefinite (d) Negative-semidefinite 
(e) Negative-definite 


14 1;3,[1 1 O0]5-L[O0 -1 17 


08 06/|5 0 0 SE 
sof NI | os 0.6 0 


06 08/0 25 0 
-0.6 08 0 
_| 24 32 0.192 0.256 
bi = 
(b) ps| 18 24 0.144 0.192 
-20 15 -0.16 0.12 
1 

16 (c) Зава 331 2 
; ? "| l8|-1 2-2 
3 
20 0 

17A=/0 1 O,b-[ 0 =F, 
00-1 


с= П -4 -2p 
The system is uncontrollable but observable; it is stable 


-2 1-2 
18 М=| о -4 
-1 0 -1 


2 -t 0127-24 _ 58,73! L1, 47 
14e Tue е + T 


x(t) = 7e! - 3g? +1 


3t 


Jet's 2577! 2 
Je pe 5 


19 (а) 6,[3 2 13,0 = opp 3$ ip 


i (3 +3t)e” +3 e" 
(0) 5) (1+ 30е «2e* 


3t 6t 
=¢ +e 


20 x, = cost - 2sint — 2 cos2t 


x, — 1cost- 2sint-* } cos 2t 





CHAPTER 2 


Exercises 
1 X(0.3) = 0.985 05 
2 X(1.1) = 0.094 913 
3 X(1) = 1.1571 
4 Х(0.5) = 2.1250 


5 X(2) 22.811489, (2) = 2.819944, 
x(t) 2 24Q + t7)/\3 


6 X,(2) = 1.573065, X,(2) = 1.558541, 
x(t) = (1 +2Ind) 


7 X(1.5 = 2.241257, X,(1.5) = 2.206 232, 
x(t) Inx(t) — x(t) = t — 1.981 214 

8 (a) X(0.5) = 0.1238 (b) X(1.2) = 1.3740 

9 X(0.5) = 1.7460 


10 (a) X(0.5) = 0.7948 (b) X(1) = -1.3511 





14 Х(0.5) = 0.1353 


15 (a) X(0.75) = 3.2345 
(b) X(2) = 2.2771 


16 (а) Х,,(2) = 2.242 408, Х, (2) = 2.613 104 
Richardson extrapolation estimates the error 
as 0.123 565 so a step less than 0.0064 should 
be used. 

(b) Xo2(2) = 2.788 158,  X,,(2) = 2.863 456 
Richardson extrapolation estimates the error 
as 0.025 099 so a step less than 0.014 should 
be used. 

(c) X,,02) 2 2.884046, Х,,(2) = 2.897402 
Richardson extrapolation estimates the error 
as 0.000 890 so a step less than 0.057 should 
be used. 

x(2) = 2.898 51 to 5 dp 


17 Х(3) = 1.46647 


18 (a) dx/dt=v, х(0) = 1 

dv/dt 2 4xt — 6(x? - fv, w(0) 22 
(b) dx/dt2v, x(1)22 

dv/dt 2—4(x?— 1?) w1)20.5 
(c) dx/dt=v, x(0)20 

dv/dt = -sin v — 4x, v(0) =0 
(d) dx/dt=v, x(0)=1 

dv/dt 2w, w0)22 

dw/dt 2 e? -- x?t - 6e'v — tw, w(0) 20 
(e) dx/dt2v, x(1)21 

dv/dí2w, v(1)=0 

dw/dt 2 sint x? - tw, w(1)2 —2 
(f) dx/dt2v, x(2)20 

dv/dt= w, v(2)=0 

dw/dt = (xt + twf, w(2)=2 
(g) dx/dt=v, x(0)=0 

dv/dt=w, v(0)=0 

dw/dt=u, w(0)=4 

du/dt =Int-—x?-xw, u(0)=-3 
(h) dx/dt=v, x(0)=a 

dv/dt 2 w, v(0)=0 

dw/dt=u, w(0)-b 

du/dt = t? + 4t — 5 + (xf) -v — (v — 1), 

и(0) =0 


19 X(0.3) = 0.299 90 
20 X(0.3) = 0.299 64 
21 X(0.65) = −0.826 03 


22 X,,(1.6) = 1.220254, X,,(1.6) = 1.220055 
Richardson extrapolation estimates the error as 
0.000 013 so, to obtain an error less than 5 x 1077, 
a step less than 0.088 should be used. 


23 Х,1(2.2) = 2.923 35036, Х,,(2.2) = 2.925 41756 
Richardson extrapolation estimates the error as 
0.000 295 so, to obtain an error less than 5 x 1077, 
a step less than 0.0060 should be used. 


2.7 Review exercises 
1 X(0.5) = 1.548 860 
2 X(1.2) = 0.524 465 


3 X,,(0.4) 2 1.125583, Х,,(0.4) = 1.142 763 
Richardson extrapolation estimates the error as 
0.017 180 so, to obtain an error less than 5 x 10^, 
a step less than 0.0146 should be used. 


4 Х,0(0.25) = 2.003 749, Х, (0.25) = 2.004452 
Richardson extrapolation estimates the error as 
0.000 703 so, to obtain an error less than 5 x 10°, 
a step less than 0.0178 should be used. 


5 X,(1.2) = 2.374 037, (1.2) = 2.374 148, 
X,(1.2) = 2.374 176 


6 X(1) = 5.194 323 accurate to 6dp. 


8 Х00(2) = 0.847 035, Xoo12s(2) = 0.844 066 
Richardson extrapolation estimates the error as 
0.002 969 so we have X(2) = 0.84. 


9 X(4) = 0.1458 (using step size 0.002) 
10 X(2.5) = −0.6532 (using step size 0.025) 


CHAPTER 3 


Exercises 
1 (a) Circles centre (0, 0), x? - y? 2 1 + ес 
(b) Straight lines through (-1, 0), у = (х + 1) tan C 
2 (a) Family of curves y? 2 4x(x - 1) - C 
(b) Family of curves y?’ = $ x°@? — 12) + C 
3 (а) 2-ху= С 
(b) xy 2 In(C 4 z) 
4 (a) (Asec(t + B), tan(t + B), Ce’), curves on 
hyperbolic cylinders (x/A)? — y? 21 
(b) Curves defined by the intersections of 
mutually orthogonal hyperbolic cylinders, 
x-y =c,x -z =k 
5 (a) R-yz-2x,f,-xz- 1, Ё=ху- 1, 
f —25 f 2; fs Y bh 0, f. X, Jz 0 
(b) fi = xyz, f, = х?2°, }; = 3х°ул°, }„ = 2уЎ, 
= 2xz),fu- 6xyz?, f, =0, fi. = 3x72, f, = 6x7 yz 
(c) f.=—yzl(x? ty .f, = 2х/(х? + у?), 
f; - tan ! (y/X), f — 2xyz/(x? + y F, 
fy 7 z0? - x) v yy, fom уо? +у?), у. = 0, 
Ў = 2xyzl(? + y’ F, fa =x? +y?) 
1 
( - 1) 
(b) te "(cos 2t — sin21) + $e sin 2t 


6 (а) 62207: – 1) +87 + 





9f Of. ос Of sind cos0 , of cos ó 
T pu eun r дф зїп Ө 


of | of eg. Sf sine 
m p 5 дө у 
8 Alr- B 
1 
2 x+y 


14 +1 


13 е2“ 





15 9, = и+ 2% 


17 20 +U -4wv) 2[ 40 - 4v] 
ul1—-4dw) O oA) ' 
v u 
JG 4) (1 – 4и) 
(Б) х2у? + уѕіпЗх + с 
(9) 2х – Зху+ 4у + с 


18 (а) ху? + х2у+х+с 
(c) Not exact 


19 -1, ysinx - xcosy * 1(y?- 1) 


20 m=2 
8x° + 36x*y + 62x3y? + 63х?у* + 54ху* + 27у* + с 
21 (36,9,12) (a) -¥ (b) 39, 4 (12, 3, 4) 


22 (a) (2x, 2y, -1) 
(b) (-yz/(x? + y’), xz/(x? + y’), tan у/х)) 


Ө зуу 
3 


(a9 = ix =x? = y? y, xX? +y?) 

(d) (yzsin n(x + y + z) + nxyzcos n(x +y + z), 
xz sin T(x + y + Z) + nxyz cos T(x + y + Z), 

xy sin n(x + y + z) + Wxyzcos M(x + y +z)) 








23 3 
24 (5i 4j - 3k)/|50 


25 (a) r/r (b) —rir? 





26 ф= ху +22 + 2у 


27 -9j + 3k, * 
28 
29 


31 
32 
33 
35 
38 
40 
42 
43 
44 
47 


56 
57 
58 
59 
60 
61 


54?25' 

(a) х+2у+3®=6б,х—-1=;(у—1)= 1(-1) 

(b) 2x - 2y - 3z 2 3, 
1e-021o-2210-2 

(c) 2x * 4y -z 2 6, lx-1-2i(y-2)24-z 


(a) 6xy (b) 4 
—61 

a, a, 3a 

=13 

(y, 6xz — 1, 0) 


X? +y’ +z’ + xyz 


a-22,b22,c23; à 2 2x?y * 2zx * 3zy + const 


J11 rads“ 

d--a,c-b 

a) 2y?z) - 2x?z? - 6x?y?z 
(a) 2) » 


(b) 2y(1 + zji  2(x * xz - 2j - 2y(x - Dk 
(c) 2yzi * 2(x — z)j * 2yxk 


156 
—1 
3 
16 
3 
(2) 95 (6) 10 (c) 8 
10.5 + 4л 
(a) 16 — (b) 16 


(c) Not necessarily. The value has to be the same for 
all possible paths. 


62 35 

63 -2i-2j Ik 

64 4n(7i + 37) 

65 (a) 24 (b) 76 (c) 16 

66 #12 

67 2 

68 (a) (In2)tan (1) (b) i (c) 1 
69 -8/(31) 

70 (а) 102-1) (Ы) [(1 - 1)? 113k? 
71 202-1) 

73 1 

74 24(1- 1л) 





75 
77 
78 


79 < 


80 
81 


1 (бл — 20) 
inc2/m-1 
0 

11 

0 


Sa(1— in) 


83 2л 


84 
85 
87 
88 
90 
91 
92 
94 
95 
96 
97 
98 
99 
100 
101 
102 
103 
104 
105 
109 
110 


а)" (тт 
(а) a 

3 

3 


2 (b) 0 
(а) Вл (b) Әп 
247 


90 
0 


(a) * 


37 
(с) ал 


448 
(b) > 
ied 
50-2 
al 
720 


50 400 


(1—е!)/6 


24 5. S il 
15° Ge 16? i) 


3.7 Review exercises 


2 
7 
8 
9 


sin(x + 3y) 
ty -yxth?-ty +e 
а); (54 


3 
6 
10 12у 
10 
6 
12 Вс 
3 
13 244 | 1g - sin (£) The AC unn" () 
3 2 a 3 3a a 


l= (e-e) 
14 ќа 
15 nq;r^l/4EI 
16 
17 
19 0 
20 Ë 


240 


© = 


CHAPTER 4 


Exercises 


3 
6 Semi-infinite strip v > 0, |u| < 1 
7 


(а) и= 13 – 4 
(b) v2 —u3 
(c) (u-- 1? - (v— 3? 24 
(9) и2 +12 = 8 
8 (а) 0= 1(-2+]), В = (1+2) 
(Ы) и+20 < 3 
(с) (5и – 3) + (50 – 6) < 20 
(d) 301 +3)) 
9 Interior of circle, centre (0, —1/2c), radius 1/2c; 


half-plane v « 0; region outside the circle, centre 
(0, —1/2c), radius 1/2c 


10 Circle, centre (1, —2), radius 7 
11 Re(w) = 1/2a, half-plane Re(w) > 1/2a 


12 wots! 





iz -j > 
Re(z) 7 const(K) to circles 


2 
“+ (ь- = 1 5 plus v=—1 (k= 1) 
1-k (1 — &) 





l 
2 


2 
Im(z) = const(/) to circles ( + J + (0 +1) = 
І 


plus u = 0 (1 = 0) 





13 


14 


15 


16 


17 


18 


19 


20 


21 


23 


24 


25 


26 
27 
28 
29 


30 


31 


(a) 1 4j, j, ve 
(Б) |w] > 42 
(с) 0=0, (и= 12 +102 = 1 
(d) +2!#е!% 


Segment of the imaginary axis |v| => 1 


(a) Upper segment of the circle, centre (2, 3 ), radius 
15, cut off by the line uw — 3v — 1 


Circle, centre G, 0), radius t 


z;7j, 07m 


ws $4 


Iw-1]<1; : 





0 2 — Zo i 
ge n where 0, is any real number 
202 — 1 


Region enclosed between the inverted parabola
v = 2 − (u²/8) and the real axis


и= 0, 2ти = (1 – т?)о 


D ,>=у--у——у} = 0; ейїрзев, 
x +y x +y 


u’ +v’ =r° and x? +y’ =r°,r large 





u=x+ 


(a) e^z(z + 1) (b) 4 cos 4z (c) not analytic
(d) —2 sin 2z 
a=-l1,b=1 


w 2 z? « jz, dwldz 2 2(1 4 j)z 
v=2y+ x^ -y 

e'(xsiny + y cosy), ze 

cosx sinh y, sinz 


(a) X -6cy -y-p 
(b) 2e™ siny +x? -y =B 
(а) (x? —y’)cos 2x — 2xy sin 2y 
+ j[2xy cos 2x + (x? — y’) sin 2y] 
(b) sin 2x cosh 2y + j cos 2x sinh 2y 
u = cos (2y?(x? & y? - 1 [G2 9 y? - 1 +4y7]}} 
y= sinh (1G? y? - D) e NEG? € y^ - 1 + 471} 


33 (a) 0 

(6) 3,4 

(e) 3« 3071-543, 3-1 — jy) 
34 z- tj 
35 (a) region outside unit circle 


b) 1І= 2+1? < е, 0 <и utan] 
(c) outside unit circle, u and v of opposite sign 
36 [Figure: image is the whole w plane]
x = k → hyperbola
y = k → ellipse


37 4a, ellipse centred at origin, semi axes are ERR 


and uel 


(b) 
(c) 


39 (a) 1 222+ 32 – 426 +... 
(b) 1 322 + 62 – 102° +... 








40 (а) 1—12 – 1) +102 – 1) – (2 – 1); 2 

(b 1 (2 - 2) + Ale - 2j)° - Be -2j)° 2 

(е) 1) +01) (2 1 -j) +3(2-1-j) 
+(1-1)(@-1-])°; 2 


41 1-2+ +... 





42 1, 1, (5; fis singularatz = j 


1,2, 2,5 Vd 
43 242 #2 Poesian 


44 (а) le2-3z Az e... (0 « |z| € D 
[2 


1 1 


(2-1? z-1 
(0 « [z- 1| «& 1) 


(b) +1- (2-0) +(2- 1)2-... 








45 (a) ...+— 
(b) z-— t —-... 
(c) asin  zf'(a) - z f"(a)4.... 


46 (a) ied + 12 + Be... 


(b) нает араса аг 
2 Z 2 4 8 
(c) оа 
Z 2 zz 7 
1 2 2 


d) — + —— + 
p (2-1) (z-1y 





Sues 








(е) edidere e= 
z-2 


-t(z-2Yy-... 


47 (a) z = 0, double pole

(b) z = j, simple pole; z = −j, double pole

(c) z = ±1, ±j, simple poles

(d) z = jnπ (n an integer), simple poles

(e) z = ±jπ, simple poles

(f) z = 1, essential singularity

(g) Simple zero at z = 1 and simple poles at z = ±j

(h) Simple zero at z = −j, simple pole at z = 3 and a
pole of order 3 at z = −2

(i) Simple poles at z = 2 + j, 2 − j and a pole of order
2 at z = 0
2 2 z 
48 (a) ~r ... (removable singularity) 
3 5 7 
(b) 441424242424... (pole of order 3) 
z z 2! 3! 4! 5! 
(c) Е + + + L -... (essential singularity) 
z 2lz Alz 


(d) tan 24 2 - 82 +... (analytic point) 


50 (a) Simple poles at = —1, 2; residues }, 3 


(b) Simple pole at z = 1, double pole at z = 0; residues 
-1,1 

(c) Simple poles at z = 1, 3j, −3j; residues +,

АСЕ i 

503-1), 5(3+)) 

(d) Simple poles at z = 0, 2j, —2j; residues –1, 

3,31 3 3. 

—8 tà» 8 — 4] 

(e) Pole of order 5 at z= 1, residue 19 

(f) Pole of order 2 at z= 1, residue 4 

(g) Simple pole at z = −3, double pole at z = 1;

Li 


residues —;, ; 


(h) Simple poles at z = 0, 2, -1; residues 2, —2, 1 


51 (а) 1 (simple pole) 
(b) —4(3 +jy3) sin[$(1 +jy3)] (simple pole) 
(c) (1+])у2 (simple pole) 
(d) -n (simple pole) (e) -jį (double pole) 
52 (a) -} (triple pole) (b) -# (double pole) 
(c) e" (double pole) 


53 —* - j?, all cases 
54 0, all cases 
56 (a) 0, — (b) 2rj 
57 {т}, л) 


58 ®л(9+]2),0 


59 (a) -inj  (b)O 


60 (a) 0 (b) 2nj 


61 (a) —3nj (b) 2j 


62 = =j, ij ; x Е =j; 2] ‚2 = }\6, ze 46; 
Z=—j\6, —15) ү6 
(а) 0, (Dix, (90 


6 (90 (0 


64 (a) 2nj, 2nj 
(b) £n(25 - j39) 
(с) 0, nj, -nj 
(d) 0 - 2л), –3лј 
65 (а) 2л// (b) in (c) sgt 
() in. (f n (gm 
© m () 10-35) 


66 2axVy/(X 4 y?) 


(d) jn 
(h) 7/242 


67 (a) (0, 0), (0, 1), (0, 7), (7, 0) 
(b) v = 0 (c) u = 0

68 H(x, y) -2y - y! * x^ 
= 22 – јг? 

70 (a) (0, 0), (1, 0), (-1, 0) 
(b) u = 0 (c) v = 0


4.9 Review exercises 


1 (а) 3ј ф®7+4 (01 (07 


2 (a) y = 2x gives 3u + v = 3, u + 2v = 3 and 3v − u = 1
respectively
(b) x+y=1 givesv=1,v—u=3 and u= 1 respectively 
3 (а) a= -3(3+j4), B=3+4j 
(b) 13 € 3u * 4v 
()|w-3-jls1 (à j7-p 


(b) u 23v 
(9) 4(02 +12) = и 


4 (а) и +0 +и-0= 0 
(с) и +12 +и-– 20= 0 





5 Left hand Right hand 


v 






y z plane 


a 


МГУ 
А) 
OUR 1
х= к э (и--#—) +v = > 
k-1 (k-1) 
2 
у=190-17+(0+1) = 1 
І 


Fixed points: 1 + 2 


6 Fixed points z = +/2/2 
r=l1>u=0 


7 u = x³ − 3xy², v = 3x²y − y³


8 (z sin z), v = y sin x cosh y + x cos x sinh y


9 у= 1/2 
2 2 
10 Ellipse is given by —— + —+— = 1 
(А +а /4к) (R-a/4k) 
11 1-242 -24z5-...; 


1 — 223 +32°— 429+... 


12 (а) 1- 22+ 222 – 223; 1 
(b 1- Қа- 0) + Ца - 1) - Ца - 1)“; {2 
(с) i13) *G-D-I056G-i -1G-);:2 
13 1,1, 1, 24 , 242 respectively 
1 з 5 
14 (a --2+2 -2 +...0 < |2|< 1 
2 
(Ы) 1- (2- 1) + (2- 1) +... (12-11< 1) 


15 (a) Taylor series 
(b) and (c) are essential singularities, the principal 
parts are infinite 


16 (а) i(e" cos2y - 1) & j1e" sin2y 
(b) cos 2x cosh 2y — j sin 2x sinh 2y 


(c) 
xsinxcoshy -* ycosxsinh y 4 j(xcosxsinhy - ysinx coshy) 


x+y 
tanx(1 - tanh^y) t jtanh y(1 tan^x) 
(d) 


1 + tan^x tanh’y 
17 (a) Conformal (b) j,-1—j (c) +0.465, +j0.465 
18 [Figure: image is the whole w plane]

x = k → hyperbolas, u²/cos²k − v²/sin²k = 1

y = l → ellipses, u²/cosh²l + v²/sinh²l = 1





3 (a) 
19 (a) Simple pole at z = 0


(b) Double poles at 2 = 2, 2e7"?, 2e4 

(c) Simple poles at z = 1, ±j, removable singularity
at z = −1

(d) Simple poles at z = ½(2n + 1)πj
(n = 0, ±1, ±2, ...)

(e) No singularities in finite plane (entire) 

(f) Essential singularity at z= 0 

(g) Essential (non-isolated) singularity at z = 0 


20 (а) 20?  ()0 (90 (@0 
21 Zeros: +1, —} + ij /11 


* 1/4 3nj/4 5nj/4 7nj/4 
Ро1еѕ: 0, еч", е7", е" ет 
64-342 





Residues (respectively) —5, TED sq 
6-3.2,; 6-32 ; 6*342,, 
4 4 4 
22 —204 — 324j 
23 (a) -inj (6) 0 (е) (000, (1) Злј (d) 0,0 
4 
(п (0) 2-95 
lix 197 
24 (а) Zn (b) 172 (с) 754 (d) 12 
CHAPTER 5 
Exercises 
1 @ - ‚Ке(»)>2 0) 5 S Re(s) > 0 
=й. 
(с) ied Re(s) ? -1 








,Re(s) > 0 (d) ? 
sg a 1) 


2 (а) 5 (b) 3 (00 (3 (92 


(070  (g0 00 G2 (1) 3 
55-3 
2 





42 6 
(b) —- 
зз +9 


(c) 3822445 Re(s) > 0 
5 5 +4 


,Re(s) 3 





,Re(s) » 0 


(d) 





s -9 





(e) , Re(s) > 2 
5-4 


(f) dad. 2s 


5+2 s +4 
5» Re(s) > —2 





,Re(s) > 0 





(g) 
(s +2) 





aA 
52+ 65+ 13 
(i) —44, Re(s) > -4 
(5+4) 
36- 65+ 452-25) 
4 


(h) ,Re(s) > -3 


Q) ,Re(s) > 0 
2:115 

52+9 
p 4 
(s : 4) 
185? - 54 
($49) 
2 3s 
(n) 3-5 


5 5416 

2, Stl 43 Reis) > 0 
(s+2) 5 +25+5 s 
(a) He” _ e) 


(cy) $-it-te™ 


(k) ,Re(s) > 0 





(D ,Re(s) > 0 


(m ,Re(s) > 0 


— 





,Re(s) > 0 





(0) 


(b) -e* 4 2e* 
9731-9 (9) 2 соѕ 27+ 3 510 2/ 

(е) a (4t- ѕіп47) (f) e?'(cos t - 6sin f) 

(в) id- e” cos 2t+3 e” sin 2t) 

(h) е-е + 27е" 

(i) е(соѕ21+ 3 іп2) (]) 1е'- Зе" + 1e" 

(k) –2е7°' + 2 сов((21) – {1 ѕіп({21) 

(D ie'-ie'(cost-3sin:) 


(m) e "(cos 2f — sin 2f) 
-2t 


(n) 1e! - 2e 3e 
(о) -е'+3е”- le 
(p) 4-2cost-*1cos3t 

(q) 9e? - e [7cos(131) - J3 їп (1 ү3)] 


(0) }е'-е "Le" (cos3t-- 3 sin3f) 


(a) x(f) 2e? e^t 

(b) x() 23 e"? - 2 (cos 2t & 3 sin 27) 

(c) x(t)= " - e” cos2t- 1e 'sin21) 

(d) y(t) =4(12e”+30te”- 12 cos2t+ 16 sin2t) 
(e) n= ttet Et 

(f) x()) 2e RUD 

(в) x()-Be'-1 gu te (cos 2t - 3sin2f) 
(h) Е й e "cost (20) + J} sin(421)] 


: 1:3 12 “Ot 12 
() x(t) 2 (18i e? '+ге diei 


Q) phd te tes ee Bain 

(k) x(r) 2 te * - 1cos4t 

(D »()-e'-2te?'5 

ОАЕ «не hue ede 

(n) x(f) 2 Ze" - Zcos t - Ssint - 00537 


= & sin 3f 


6 (a) x()- ie" - 4e - e), y(t) = 1Ge" - e) 
(b) x(f) 2 5sint Soa —e%-3 
y(t) = 2е'– 5 ѕзіпі+ е – 3 
(c) x(f) 23sint - 2cost c e? 
y(t) =-3 sint -- 2 cost - 1e" 
(d) x(t)= е" -Ie, у() = –1 tietie” 
(e) x(f) 2 2e' * sint — 2cost 
y(f) 2 cost — 2sint — 2e' 
(f) x(f) 2 3 - e 3e? 
y(t) 2 t- 1- Ie' 43e 
(g) x) =2t-e' te, y(t) 2 t-1433e «1e 
(h) x(t) = 3 cost + сов(/3ї) 
y(t) = 3 cost — cos( 43t) 
(i) x(t) = соѕ( үф) + 2 соѕ( 67) 
y(t) 2 $cos( 455!) – 1 соѕ( 61) 
(j) x(t) =$e'+2 cos2r+}sin2r 


-t/3 


2t 


y(t) = ёе - 2 cos2t - 1 sin2t 


7 L()- E,(50-4 s)s 
(s? 105) (s - 100)? 
Es? 
D(s)- ————— —— 5 
(s? 105) (s - 100)? 





i(t) = Е(—1;е I" 1t e + 1 cos 1007) 


9 i,(r) 2204 e" sin(1/7f) 


10 x,(4)= —+ cos( 31) - п cos( /131) 
xY(f) 2 -:5cos( 31) +i t cos( 4134) , 9, 413 
13 f(t) = tH(t) — tH(t — 1) 
14 (a) f(f) 2 3? — [3(t — 4 4 22(t — 4) - 43]H(t — 4) 
- [2(t - 6) - 4]H(t — 6) 


ко)=5-(6+2.8)6 (2.3) 
S 


S S S S S 
(b) f(0-t-2(- DH - 1) (t- 2) 2) 
F(s)=4-2e%+4 0% 
8 S 


15 (a) H(t- 5) e"? H(t- 5) 
(b) 3e ^? - e? ^? pr - 2) 
(c) [t — cos(t — 1) — sin(t — 1)]A(t — 1) 
(d) £e * "^ (I3 cos[1/3(t - n)] 
+ sin[}/3(¢- 1) ]}H(t- T) 
(e) H(t- $n) cos5t 
(f) [t — cos(t — 1) — sin(t — 1)]A(t — 1) 


16 x(!) 2 e* 4 (t- D[1 — (t — 1] 
17 x(2) 2 2e "^ cos(1/31) - t- 1 - 2H(t - 1)
(£- 24 e ^? tcos[L3(r - 1)] 
- J$ sin[} /3(t - 1)]}} 
+H(t-2){t-3+e0” 
{cos[ 5 /3(t- 2)] - y; sin[; /3(t - 2)]}} 
2t 


18 x(t)= ee a(sint - 3cos t+4e"e 
SU 


= e^)H(t- $n) 
19 f() 23 - 2(t - 4)H( — 4) 
F()-342.* 
S S 


x(t) =3 — 2 cost + 2[t — 4 — sin (t — 4)]H(t — 4) 





20 0 (t) = (1 - e™ cost- 3e™ sint) 
- &[1 - e'* e?'cos(r- a) 
За -31 


Es e" sin(f - a)|H(t- a) 


21 @(t)=5(3 - 2t - 3e " - 10re ) 
+ 5021-3 + (2t - 1) e“ A(t 1) 


3-3eP-6se ^ 
s'(1- e) 


25 (a) 20(t) - 9e? — 19e 

(b) ó(r) - 3sin2t 

(c) ó(1) - e (2cos2t * 1 sin21) 
26 (a) x() 2-267 41 R 
+(e enge ?R( - 2) 
(b) x(t)=} Hon. - 21)sin2t 
(c) x(t) 25e? — 4e * + (e — e €) p: — 3) 


27 (a) f'(t) 2 g'(t) - 43ó(t — 4) — Aó(t — 6) 





6t (0=7< 4) 
80) =12 (4<1<6) 
0 (126) 
1 (0ж=т<1) 
(D g(0-1-1 (1 <t< 2) 
0 (re 2) 
(с) (0) » g'(t) - 5é(t) — 68(t — 2) 4 156(t — 4) 
2 (0 <t<2) 
g’(t)=4-3 (2<1t<4) 
21-1 (2 4) 
28 x(t)=-Be + Bette” 
bs 2; 1 R R
sinnt, n = — - —, =~ 


Eg 
n LC 4° 2L 


30 q(t)= 
E 


i(t) = e "(n cos nt - u sin nt) 
Ln 


1 4 3 n 
31 у(х) = ВЕТ ^^ 0+8Й(х- 1) H(x - H) 


—4(M + W)x3 + (2M + 3W)Px] 


wGQi-xi)x wa- xi) 





Spe ET 6EI 
+ izle - x0 HG -x) 


- (& - xj Hx - xj] 
Ymax = W1*/SET 


33 у(х) = zi lx! - lx - b) Hx - b) - lox ] 


Wx? 3b = 
GET! -x) (0<x<)b) 


2 
i x- b) (bex«l) 


352 


5 +25+ 5 
(b) 52+ 25 + 5 = 0), order 2 


ty, 2 
(c) Poles —1 +j2; zero —} 


34 (a) 


s’ +5s+6 
5+ 552 + 175+ 13 
order 3, zeros —3, —2, poles —1, —2 + j3 


36 (a) Marginally stable (b) Unstable 
(c) Stable (d) Stable (e) Unstable 


37 (a) Unstable 
(b) Stable 
(c) Marginally stable 
(d) Stable 
(e) Stable 
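
The verdicts in answers 36 and 37 depend on where the poles of the transfer function lie in the s plane. The following is a minimal Python sketch of that test, offered as an illustration rather than as the book's method. The function name classify_stability and the tolerance tol are hypothetical, and the example poles are the ones quoted in the earlier answer (−1 and −2 ± j3).

```python
def classify_stability(poles, tol=1e-12):
    """Classify a continuous-time linear system from its poles.

    stable            : every pole has negative real part
    marginally stable : no pole in the right half-plane, and every pole
                        on the imaginary axis is simple (not repeated)
    unstable          : anything else
    """
    poles = [complex(p) for p in poles]
    if any(p.real > tol for p in poles):
        return "unstable"
    on_axis = [p for p in poles if abs(p.real) <= tol]
    if not on_axis:
        return "stable"
    # A repeated pole on the imaginary axis gives a growing response.
    for i, p in enumerate(on_axis):
        for q in on_axis[i + 1:]:
            if abs(p - q) <= tol:
                return "unstable"
    return "marginally stable"


# Poles quoted for the third-order system earlier in this list.
print(classify_stability([-1, -2 + 3j, -2 - 3j]))
# A pair of simple poles on the imaginary axis (hypothetical example).
print(classify_stability([1j, -1j, -1]))
```

The repeated-pole check is a design choice made here so that a double pole at the origin, for example, is not reported as marginally stable.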


35 ‚52 + 552 + 175 + 13 = 0 


40 К>? 
41 (a) 3e" -3e* (b) ie "sin3t 
(c) Xe" - e?^) (d) Le"sin3t 


2 5+8 
(s+ 1)(s+2)(s+4) 


49 (а) L[2- e^ (98 6t 2)] 
(b) de" (5r 2) e e"(5r - 2)] 
(c) L(4t- 1 e ^) 





51 е-е“ 
x() 2 LA[1L-4e7 € 3e" -(1- 4e? 0D 
+3e" At - T)] 


52 e"'sint, [1 - e” 


53 К = E | a + AG у= 1 a 
X) 3 -1|»* 5 х2 


12s + 59 
(s+2)(s+4) 


«е 4" 9 


0 1 0 0 
55 (а) х= 0 0 1)/*+/0]4, 


(cost * 2 sin f)] 


G(s) = 


-7 -5 -6 1 
y=[5 3 1]х 
0 1 0 0 
(0) х= 0 0o i1x*[ow »-[2 3 Ix 
0 -3 «4 1 


-—5+фе'+10е° 
57 x(t) ; d 
3-е +ѓе 


58 x, 2x, 22e?! - e?! 





-Í -2t 
59 x(t) = 4te +e 
-4te'!-2e ? 4 2e* 
- © 0 i 
60 2= | 0 -2 Omz-*-1|w у=[2 9 2k 
0 0 -3 l 


2 


The system is stable, controllable and observable 
0 0 0 i 
612-0 -1 Omz-t|-iu у= [5 3 15] 
0 0 -5 i 


The system is marginally stable, controllable and 
observable 


62 8 — 24+ 2e%-6e" 


Х| o ı o olaf lo o0 
aet -r 0 Це о 

X, о о о 111 lo 0|" 

ža 0 1-1 -1||х| |0 1 




X1 
5|. 1000]» 
У 0 0 1 0|% 
X4 
2 
G(s) = 1 - s +s+l ny 
(5+1) (5 +1) gs ssl 
X, -] -1 -1}/*1 -1 1 | 
64 (а) |х = |-1 -3 -3||х\+[-1 1 A 
| lat. = 8m [D 15- 


{ 


(b) G(s) = | 


0.2 2|» 
0 0 Ix 
25(25+3) 25(25+ | 
2 2 
-s Ё 
^ 2 s «10s? 4 1654 6 
(c) yÒ = 1 + 0.5786!” — 1.8246% + 0.24667!” 
у = 0.177е7*12 4. 0.2726: 959 — 0,4496713% 


65 un)-[-2 -Z]x0-*u 


ext 


66 u=} -F]x() +u 


ext 


67 u(t) = [8-2 J x(t) + tow 
u(t) = [-31 -11]x(?) * ug 


69 M= as , rank 1, M= 0 
1 -1 1 





1 
Е 


Rie 


5.10 Review exercises 


1 (a) x(t) =cost+sint— e (cos t+3sin f) 
(b) x()) 2-342 B e+ 5 eth 


2 (a) e'- lg? ' - ie (cos t sin Г) 


(b) i(f) 22e'-2e* 


t V[e'- i Ps Le “(cos t+ sin t)] 
3 x(t) 2 ot 5sint — 2sin2ft, 


y(t) = 1 — 2 cost + cos 2t 
4 1(соѕ/ +2 510) 
is - Dcost-t (xi * xo - 2) віп г] 
v1, 63.4? lag 
s cos Ó - O sin 


6 (a) (i) 9.0. 
8 


л 5 50 ф + 0 (соѕф + ѕіп 
(і) 551204 excos 9r sin ) 


52 +205 +20) 








(b) a; (cos 2t - 2sin 21) - e "(39 cos 21 4- 47 sin 2f) 


7 (a) e (cos 3t — 2 sin 32) 
(b) p(t) =2+2sint—-5e* 


8 x(t) =e +sint, y(t) =e — cost 


9 q(n-a(5e -2е 7) 
— a; (3 cos 1007 - sin 1007), 
current leads by approximately 18.5? 


-100¢ -200t. 


Fa -t/5 -2t 
10 x()22e - 15 e" + Те 


= az(76 cos 2t - 48sin 21) 


11 (a) 02 ,L(4e * + 10ге“ – 4со5 27+ 3іп 27) 
(b) i, ie" 6675, 52 ie" - e^) 


12 i-5p -e "(cos nt + sin nt)] 


-Rt/L -3Rt/L 


13 p=E4 3e -e  ) i AR 
6R 
14 x,(f) 2 Isint - 2sin2t + J3 sin(434)] 
x(f) 2 H[sint - sin21 - /3 sin(/32)] 
15 (a) (i) e'(cos3t * sin37) 
(ii) e’ — e* + 2te’ 
(b) y( 2 ie (84 121 P) 
16 (a) 3e"sin2r 
ni 
Ks(s? - 2Ks 4 n) 


-Kt Kt 


Jet oc qe 
K 
17 (a) (ii) e ^ "[cos2(t- o) - isin2(t- o)]H(t - æ) 
(b) y(t) = [е (cos2t - Isin2r) * 2sin f - cost] 
-@-т) 
+ ale б (соѕ 27 – 1 5іп 21) + cost 
— 2sin t] H(t — n) 


18 i(t) = zs[e ^ -2H(1- in 67490-10) 
+ 2H(t e T) е740-Т) 
3 -40(1-3T/2) 
c2H-$T)e 7 $00] | 
Yes, since time constant is large compared with T 


19 e*sinf, i[1- e (cos! sint)] 


4 
20 ЕІФУ = 12 + 12Н(х- 4) - Ró(x - 4), 
dx 


(0) = у'(0) = у(4) = у®(5) = у®(0) = 0 
lxt -425x +9x (0<х=<4) 
3097431 4 3 2,1 4 3 
1x4 — 4.25x° + 9x? + (x - 4)* - 7.75(x - 4) 
(4<x<5) 





25.5 kN, 18 kN m
21 (a) f(t) = H(t − 1) − H(t − 2)
x() 2 H(t- D) - e) - H(t- 2)0 — e? 
(b) 0, E/R 
23 (a) t-2 * (t- 2)e* 


(6) у=1+2 – 2е'+ 21е, уг) = 12+; 





3 


24 Ely 2 Wl? & Wy? - EG - 2 His -1) 


4 
EI r =-Wô(x- 1) - w[H(x) - H(x - 1)] 
X 
25 (a) x(r) - 
LOL e 9? [3 sin(1437) - cos(1430) ]H(t - a)] 
1 


26 (а) Мо 0 a 
s +2s+(K- 3) 


(d) K>3 


27 (a) 4 (Ы) + 


28 (c) 4e? - 3e? y(r) 2 1 (t 9 0) 


a 
29 x(t) = е sint 
I - e ‘(cost+ sin?) 
5+2 


H(s) = ———— 
(s+1) +1 


e "(cost — sin f) 


30 1,-1,—2;[1 0 -1],[1 -1 
и@) = —6{х(@) + х„(@)} 


зо | 
у= 0] К 


_ 5+3 
(Е С 


0]7, [0 о 1 


К 


s+(1+KK,)s+K 
(с) К = 12.5, К, = 0.178 


33 (а) K, = М,0? 


34 (b) Unstable 
(d) -8 dB, 24? 
(е) К= 10° т = 10%, 2 = 107, = 4х 10* 
(£) 5 + 36 х 1082 + 285 х 1025 
+25 х 10'%1 + 1078) =0 


32 (а) 


(d) 0.655, 2.485, 1.865 


(c) В= 2.5 x 105, 92 dB 





CHAPTER 6 
Exercises 
42 1 Z 
1 (a) go lA (b 12123 





11 


12 


13 


14 


15 


16 























P И ae 

(©) DU (d 5, 121 
(е) 3—2—, |212 1 

(2-1) 
ey o 

z-e^? 

1 22 | 2 —. 
2 22-1 z(2z- 1) 

52 = 
(а) 5:41 (b) Е 

27 2z 

22-1 (2z - 1 

-4kT z 
(а) {е јо = 

2-е 
(b) {sin kT} o T 
z -2zcosT+1 
(c) {cos 2kT} e —2E= cos 27) 
z -2zcos2T+1 

(9) (Ы (1) о к 
(е) iP (f) (-jy2)* 


(g) 0 (k — 0), 1 (k 7 0) 
(h) 1 (20), (CD*' (& 7 0) 


(a) i1 - (2) (Ы) ip^ - 75] 
(gee (d) $ «$C D) 
(e) sin ikm (£) 2“ sin ikr 

(g) ik- 1d - 35 (b) k& 21 cos(3k — л) 


kel 


(a) {0, 1, 0, 0, 0, 0, 0, 2}

(b) {1, 0, 3, 0, 0, 0, 0, 0, 0, −2}

(c) {5,0,0,1,3} (d) (60, 1,114 (D 
(e) 120), $e 1) $22) -1-D (k = 3) 


0 (К = 0) 
f 
B Lm (k 2 1) 
0 (k- 0) 
ш) м (К > 1) 


1 = 1 b= 
Via + Ver 7 Xi Уно t 4A 7 $Yk 7 ХЕ 


(а) к=к (6) у= 509°) + 9(-1)* 
(c) 2*"sintkn — (d) 2(-D'«3* 


(а) x,22 7D - £O 1 
(b) y»,22(35 - 625 «3 
(c) ¥n=2(3") — 3(2") + 404)" 
(d) y, 2 -2(43)"' sin Anz 1 


(е) y,=—2(-})" + 2(2)"- 2n-1 
(f) y,=—3[2" + (-2)"]+1-n 


17 (b) 7, £4841 
18 уџ= 2 – 13%) +1 
19 As k → ∞, I_k → 2G as a damped oscillation


м (а = 
2 - 32+ 2 
2-1 
zZ-3z41 
2+1 


2-2 +22+1 


(b) 








(b) 2(35 sin (К+1)т 
(d) gil 4 ok 


23 (а) (7D - (-55 
(c) 30.4)  1(-0.2)* 


i о (k = 0) 
at cS 


f (k=0) 
27! (k=l) 


25 (a), (b) and (c) are stable; (d) is unstable; 
(e) is marginally stable 


26 2- (bf 


28 y, 2 -4G) * 2G) * 2" 


30 (a) ТИ PF | > {| 
4\4 


2 4 |-4 2 
(b) © 1 H zl | 
2 |1 1 11 


(c) zu 1 
0 1 


31 x(k) = 5*(coskO+ sink@), y(k) = 5*(2 cos k0), 


=a? 
cos 0 = —; 
25 _ 11(_¢.2)* + 20.8)
32 x(k) = uc ea Ee 
2 - (34/6)(-0.2)* - (17.6/9)(-0.8)* 


А ЧН] 





за DT]. 1 l(1- 67D) xi (4T) 
x5[(k +1)T] 0 e X(kT) 

Iry He? 
EG kr) 

101-е°7) 


35 (а) x [(k +1)T] _| 1 T | x(XT) + 0 u(kT) 
xl(ket1)TI] |-7 i1-T|x(kT) T 
MAT) =[1 0]x(&T) 
(b) х[(& + 1)Т] = Gx(kT) + Hu(kT) 
MAT) =[1 0] X(T) 
_ BITE * 5sin(2T) 
-4sin(2T) 
Asin T) | 
cos(2 T) - 5sin(2T) 
™ | - €™ cos(2T)- 46” d 


-Т/2 . | 
26 sin(2T) 


37 (ау х(Е+1)= is eos m ав 


0.632 0.3684, 
(b) x(&-- 1) — 0.368. -0.1185| x (X) 
0.632 1 x(k) 


,) 01185 0 ke 
0.069  -1|1.1x,(0) 


(c) x(t) =x,(O)[1.1 — 2.15e" + 2.05e 2^] 
x,(t) = k, + x,(0)[-5.867 4- 8.6e ^! — 2.71e?^] 


38 q form: 
(44 + Ва + С)у,= Аф + 29 * Vu, 
6 form: 
[AA25? + (2AA + AB)S+ (4 + B+ C)]y, 
2 A4 + 4A5 + А252)и, 
А= 23 + 6А +4 
В= 4л – 8 
С= 2А – 6А + 4 


1 


5 +252625 + 1 
[(А? + 4А? + 8А + 8)5° + (6А? + 16А + 16)? 
+ (12А + 16) + 8] y, = (2 + TS Yu, 


39 
2
41 12 - z) 
(12+ 5A)z + (8A - 12)z- 


12y(1 Ay) 
A(12 + 5A)y° + (8A - 12) y+ 12 


6.12 Review exercises 
4 342k 
5 1 3(-2)*- 1-1)“ 


7 2z B 


(z - ey? zc 
1 n n 
8 (a) [e ZÜ 
(b ()3*k Gi) 244 sin ikn 
9 i-1(-p'-2f 
10 (-1) 
13 14[2 - 2" - n()"] 
170 3- 5p 2 1j o0 1f 





-2T 





st -3042 
x(k) = 1-2* 
CD.) 


18 D(z exe (i) "=l; E 


22 +42-5 2 
3 
(ii) M;'- f (ii) vb =[0 —3] 
о 
2 
_1 11 
üv)T - : () T'- 
1 -2 0 
a=-5, B=4 
CHAPTER 7 
Exercises 
1 (a) ft)» -la _ 2 w cos(2n- Di 
4 m^ (2п-1) 
у 3sin(2n - l)t sin 2nt 
- 2n-1 2n 


_ 2 cos(2n - D sin nf 
b) f(t) 2 n 7 У S — 
RIS D (2n - 1? n 


n=1 n=1 





a2 sin nt 
(c) f(t)= 2 TS 
(= В cos 2nt 


E 


mm 4n -1 
yc B cos nt 
rm 4? -1 
_ 4 cos(2n- It 

2 (2n - 1y 
(p ft) --5 y, m 5 -S ч 


(h) f(t) = (is + sinh) 


(9 0 = 





Ale ale” 


(е) 0) = 


ыле алы аты 


(f) f) 


п=1 п=1 


+25 се! CUPS on 
T 1 ml 

2 п(-1)" : 
== sinh 1 sin nt 
a +1 


1 


2 foi SEES 
Taking t = π gives the required result.


3 q) =0| 1-4 D P 
2 (2n - 


n=1 


10 cos 2nt 
4 A=? + Í sint- —— 
2 re -1 


5 Taking t = 0 and t = π gives the required answers.


6 A= -25 cos(4n - ~ 
T (2n - 


п=1 
Taking t = 0 gives the required series.
7л0=2+2 у 8%21- D: 
2 m ^4 (n-1 
Replacing t by t − ½π gives the following sine series
of odd harmonics:


((-1)-3--5 y Etfi ne 


n (2n- 1) 


n-l 


21x (21! плі 
8 (0) 2 пт! 
f(t) т 2 Ha 


2К җе 1. пл 
9 f()) 2 5 V. 5 sin Œ 
До) 7 


n=1 


3.6% 1  sin(2n- l)nf 
iie EN телш 
= a a-i 5 


cos2nat 
An? - 1 


A 1 
11 v(t) 2 | 1 c ^xsinot - 2 
4 | > 


2 о п 
12 f= tr + tty CP cos 
3 


n 


n-l 





15 fi) - -5 Y 1 s eosQn - nt 
n 


= 


-Ly ios 2nmnt 
п=1 

l1. 

= = sin 2ntt 
tale 


x 1 sin 2ntt 
n 


(b) f(t) = 


Qu 


n-l 

NES 4 

n 2 Ее 2п-1 ет 
x sin(2n — l)nt 


E п+1 
(с) fa) 3 y C cos nat 
з у n 


17 KA= am 2- Ș 4 cos 2nt 
n=1 n 


8 1 . 
А) = : » TET sin(2n - 1)t 


18 f(x psu mS 


- 1)? 
19 ®=® x MID ш 


п+1 
20 foe sinet 1 ICE" sin 2n 
4n -1 


n=1 


2 fo)=-14-4 Y —— cos 2 Dr 
2 


т? 


22 TG TELA 2c 1 = ; si Qi lynx 


sin 2nnt 





23 f(t) = 


i 


+toosm+2y 1 
2 п An -1 


= 1 
Lin 





a IN 


1 sin(2n - 1)лі 
п+1
26 (с) 14У CD sin nf 


n=1 


1.2 c 2 n 
29 (a) =n + > —(-1) cos nt 
6 = п? 





n 


(b) а„=0 


E = Des 2(- S inm 


a lr 


ar jn) 
n=—| COS NT - cos =nT 
пт 2 
2 
T 2(25 sin lam - I. соз lan 
An? 2 8n 2 
+ 3 cos lan = 2 sin Lnn) > 
п 2 тп 2 


2 


— (2.1 ui "Jsinar «| 


-45 e У sin(2n - 1)t 
= T (2n-1) 


- (= a 16) sin t4 1032 + л? - бл) віп 21 


п=1 п=1 


@ 1425 
4 m ia (2n- 





5 608 2(2n - Dnt 





30 es 54 29 Y l | sin(2n - 1)100x: 
п 72) 2n- 1 
i,(t) ~ 0.008 cos(100nt — 1.96) 
+ 0.005 cos(300nt — 0.33) 


31 f(t) = Y Z3 sinn - 9r 


n-l 


x, (f) = 0.14 sin(nt — 0.1) 4- 0.379 sin(3nt — 2.415) 
+ 0.017 sin(S5at — 2.83) 


100 — (-1)"" 
32 f(t) 2 — 2nnt 
S(t) * 2. sin 2nn 


x,(f) = 0.044 sin(2nt — 3.13) — 0.0052 sin(4nt – 3.14) 
33 e(t)= 100 , 50 sin 50nt - 200 x 2253 cos [бш 
Т T n-l 4n -1 


i(2) = 0.78 cos(S0nt + (—0.17)) 
— 0.01 sin(100m7 + (—0.48)) 


_1, хс j p_i qma 
35 л9=5+ 5, ale Die 
n#0 


36 (a) ny im. luec vie" 


п=—со 


n#0 
b) Ssi =
е У 


п=—оо 


-— C- D'« 1] e" 
2n(n - 1) 


3 = ES ету jnt 
(c) 2 s -D']e 


n#0 
(d) 2. 1 - ein 
Tj. 1-4n 
38 (b) (i) 17.74, (ii) 17.95 
(c) 18.14; (i) 2.20%, (ii) 1.05% 


39 (a) c= 15, in - e72) 


15, 2a -j hc D ej). 0, Sj) 


(b) 15 W, 24.30 W, 12.16 W, 2.70 W, 0.97 W
(c) 60W 
(d) 91.9% 
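
The percentage in 39(d) can be recovered from the figures in parts (b) and (c). The short Python check below is only an arithmetic illustration. It assumes that part (b) lists the power in the constant term and the first four harmonics and that part (c) is the total power; the variable names are hypothetical.

```python
# Powers quoted in answer 39(b), in watts, and the total from 39(c).
harmonic_powers = [15.0, 24.30, 12.16, 2.70, 0.97]
total_power = 60.0

# Power captured by the listed terms and its share of the total.
captured = sum(harmonic_powers)
fraction = 100.0 * captured / total_power

print(f"power in listed terms: {captured:.2f} W")
print(f"share of total power:  {fraction:.1f}%")
```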


40 0.19, 0.10, 0.0675 
41 (c) c, 20, E сз=—[ 


5 
»O=7,0,=0 


42 (c) со=1, су = 


Nie 


46 (b) со= 0, с = \(2m), c, = 0, MSE = 0 


7.9 Review exercises 


1 f()- "i "A =(—1)" cos nt 


| T gp te ov 
2n-1 m(n- 1) 


= sin 2nt 
2n 





+ 


мм 


п 


Taking Т = x gives the required sum. 


2 0) = >, 4 cos sree s+ 9^1 eos 


Tín 


2 
КШ 


п+1 
3 (a) Ane y e in г ee 
(em 
(c) Taking г = 17 gives S = In? 


4 9 (-D)'sin(2n - 1)t 
5 1) = – 
f(t) 2. n- 


8 f(x)-- a Б ; sin(2n - 1)x 


n-l 





(isan > cos 2(2n a 
4 (2n - 1) 


n-l 


10 (а) Дй= » 2 sin nt 


n=1 


1 4 <= 1 
b == z 
® Л) Е 





y cos(2n - 1)t 





13 (а) fü- T - 2 ; cos(2n - Dt 








(2n - 1) 
(b) КОР? 1 sin(2n - 1)t 
т 20 - 1 
15 (а) v) 0 i ssin 2T 20 y: —L— cos SERE 
т т4{47-1 Т 
(b 2.5W, Dome 
16 (b) s()-3 y, l.. sin(2n - 1 
n 2n-1 
f) - 1g) 
18 (b) асыла 2 sin Œt - & cos ot 
1+@ n (2n-1)(1 +0?) 
«= (4п — 2)л/Т 


19 (с) Т,=1,Т,=,Т,=22—1,Т,=4— 3 
(D Ln +T -in +57 -PT 


(е) S? -5P «94-9, M, 5-1 


CHAPTER 8 


Exercises 


1 Za - 
а+@ 
2 AT^jo sinc 27 
3 AT sin’ 


4 8Ksinc 2a, 2K sinc @, 2K (4 sinc 20 — sinc @) 


5 4sinc 0— 4sinc20 





70 

(a 4 jo) 4- o», 
еа 
X ta x +a 


1 


тт 
(1- 0) +3јо 


13 4sinc20 – 2 ѕіпс 0 


14 17 [5іпс 100, - @)T + sinc }(@ + @)T] 
15 Lpe ser [e ^7? 
? -j@gT/2 


sinc (o - @)Т 
+е sinc }(@+ @)T] 
16 j[sinc(@+ 2) — sinc(@— 2)] 
18 4AT cos wtsinc oT 

19 High-pass filter 

20 ле"! 


21 T[sinc(@— @))T + sinc(@+ @)T] 





26 12)15(0+ @) - &(9 - 9) - 
2 09-0 


28 {2, 0, 2, 0} 
29 {2, 0, 2, 0} 


32 D(z) = 0.06366 − 0.10660z^-2 + 0.31831z^-4 + 0.5z^-5
+ 0.31831z^-6 − 0.010660z^-8 + 0.06366z^-10


33 D(z) = 0.00509(1 + z '?) — 0.04221(2? 4 2*) 
- 0.29035(z ^ - z $) 4 0.52? 





8.10 Review exercises 


1 511.0 _ С0520 
2 
w @ 


2 —“1зїпс2@ 
@ 


7 (а) ——(е" – е")н(0) 
a-b 
(b) () re"H(f) (ii) (t(- 1+ eA) 
8 (a) —sinoy(£ in) —— (b) cos cof 
OB (d) —je 


+275 
17 (а) At ens 
а? + Ans? 


(b) —. (sin2nsT - cos2msT + 1) 
2Ts 


CHAPTER 9 


Exercises 
1 2= Ре? 
2 а= +їс 


5 For α = 0: V = A + Bx





10 


12 


15 


16 


19 


20 


21 


22 


23 


25 
For α > 0:

V = A sinh at + B cosh at, where a² = α/κ
or Ce^{at} + De^{−at}

For α < 0:

V = A cos bt + B sin bt, where b² = −α/κ


n=-3,2 
a=-3 
(a) 4, — (L0), 


(b) va = (rg + (rejv, and oW, = Wa 
(с) Wa = (Le)wy 


g(2) = (1 + 22)/(1 + 2 


u = sin x cos ct 


T : : ; 
u- Gin x sin cf + i sin 2x sin 2cf) 


и = E sae) ct) Esin 3n(x - ct) Wa E 
T 


+5 sin Ee а +.. | + = ate) 


_ 13те + y 1 шәл Ed, 
9 1 25 1 2 


u 7 i [exp(-(x — ct) - exp(-(x * c3] 
u=; Fœ- ct) + i F(x ct), where 

ler (ers) 
F)= 142 (-l<z<0) 

0 (12| 2 1) 


x + (−3 − √6)y = constant
and
x + (−3 + √6)y = constant





u = 1[4(x + 20) - (x — 30 — 5] 





x     F(x)      u(x, 0)  u(x, 0.5)  u(x, 1)   u(x, 1.5)  u(x, 2)
-3.0  0.024893  0        0.025943   0.058509  0.106010   0.180570
-2.5  0.041042  0        0.042774   0.096466  0.174781   0.297710
-2.0  0.067667  0        0.070522   0.159046  0.288166   0.490842
-1.5  0.111565  0        0.116272   0.262222  0.475106   0.681635
-1.0  0.183939  0        0.191700   0.432332  0.655692   0.791166
-0.5  0.303265  0        0.316060   0.585169  0.748392   0.847392
0     0.5       0        0.393469   0.632120  0.776869   0.864664
0.5   0.696734  0        0.316060   0.585169  0.748392   0.847392
1.0   0.816060  0        0.191700   0.432332  0.655692   0.791166
1.5   0.888434  0        0.116272   0.262222  0.475106   0.681635
2.0   0.932332  0        0.070522   0.159046  0.288166   0.490842
2.5   0.958957  0        0.042774   0.096466  0.174781   0.297710
3.0   0.975106  0        0.025943   0.058509  0.106010   0.180570
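
The entries of this table can be regenerated from the F(x) column alone. The Python sketch below is an illustration under stated assumptions, not the book's working: the closed form of F (½e^x for x < 0, 1 − ½e^(−x) otherwise) and the wave speed c = 1 are inferred from the tabulated numbers, and the combination u(x, t) = F(x + ct) − F(x − ct) is the d'Alembert form that reproduces them.

```python
import math


def F(z):
    # Assumed profile, inferred from the tabulated F(x) column.
    return 0.5 * math.exp(z) if z < 0 else 1.0 - 0.5 * math.exp(-z)


def u(x, t, c=1.0):
    # d'Alembert-type combination; c = 1 is an assumption that matches the table.
    return F(x + c * t) - F(x - c * t)


times = (0.0, 0.5, 1.0, 1.5, 2.0)
for x in (-3.0, -1.5, 0.0, 1.5, 3.0):
    row = "  ".join(f"{u(x, t):.6f}" for t in times)
    print(f"{x:5.1f}  {F(x):.6f}  {row}")
```

Because F(z) + F(−z) = 1 for this profile, u is even in x, which is why the upper and lower halves of the table mirror each other.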
x     t=0  t=0.25  t=0.5    t=1       t=1.5

0     0    0       0        0         0

0.25  0    0.0625  0.125    0.179687  0.210937
0.50  0    0.125   0.21875  0.265625  0.269531
0.75  0    0.0625  0.125    0.179687  0.210937
1.00  0    0       0        0         0











x     t=0  t=0.25  t=0.5    t=1

0     0    0       0        0

0.25  0    0.0625  0.12245  0.17407
0.50  0    0.125   0.22449  0.2815
0.75  0    0.0625  0.12245  0.17407
1.00  0    0       0        0





29 Explicit 








x    t=0  t=0.02    t=0.04    t=0.06    t=0.08
0    0    0.031410  0.062790  0.094108  0.125333
0.2  0    0         0.000314  0.001249  0.003101
0.4  0    0         0         0.000003  0.000018
0.6  0    0         0         0         0.000000
0.8  0    0         0         0         0

1.0  0    0         0         0         0





30 Explicit 








x    t=0   t=0.2  t=0.4   t=0.6
0    0     0      0       0

0.2  0.16  0.19   0.2725  0.38875
0.4  0.24  0.27   0.36    0.508125
0.6  0.24  0.27   0.36    0.508125
0.8  0.16  0.19   0.2725  0.38875
1.0  0     0      0       0











x t=0 t= 0.2 t= 0.4 t= 0.6 
0 0 0 0 0 

0.2 0.16 0.19 0.2319 0.2785 
0.4 0.24 0.27 0.3191 0.3849 








31 Explicit 





x t=0 t=0.2 t=0.4 t=0.6
0 0 0.03 0.12 0.27 
0.2 0.16 0.19 0.28 0.43 
0.4 0.24 0.27 0.36 0.51 
0.6 0.24 0.27 0.36 0.51 
0.8 0.16 0.19 0.28 0.43 
1.0 0 0.03 0.12 0.27 
Implicit (symmetric as in the explicit case) 

x t=0 t=0.2 t=0.4 t= 0.6 
0 1 0.03 0.08 0.1495 
0.2 0.16 0.19 0.24 0.3099 
0.4 0.24 0.27 0.32 0.39 





32 u- la[exp(-$ km'f) cos? x 
+ exp(—} KT?) cos } 1x] 

33 Ay = 2/nN 

34 а= -1, к= –1 

35 В= 2, и= —uye"sin(x — 2f) 


36 The term represents heat loss at a rate proportional to 
the excess temperature over 4). 


= -k(n 4 it 
37 и= Уа, ехр B [o De 
1 2/1 


0 DS 
(2п + 1)2л2 (2п+1)л 


сж a 


39 u(0, t) = u(l, t) = 0 for all t
u(x, 0) = 10 for 0 < x < l








41
x    t=0   t=0.02  t=0.04  t=0.06  t=0.08  t=0.1
0    0     0       0       0       0       0
0.2  0.04  0.08    0.1     0.12    0.135   0.1475
0.4  0.16  0.2     0.24    0.27    0.295   0.315
0.6  0.36  0.4     0.44    0.47    0.495   0.515
0.8  0.64  0.68    0.7     0.72    0.735   0.7475
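
The table for answer 41 is consistent with the standard explicit finite-difference scheme for the heat equation. The Python sketch below reproduces the tabulated values under assumptions inferred from the numbers themselves rather than taken from the exercise statement: initial profile u(x, 0) = x², fixed end values u(0, t) = 0 and u(1, t) = 1, Δx = 0.2, and mesh ratio λ = κΔt/Δx² = 0.5. The names ftcs_step, lam and history are hypothetical.

```python
def ftcs_step(u, lam):
    # One explicit (forward-time, centred-space) update of the interior
    # nodes; the two end values are boundary conditions and stay fixed.
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + lam * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
u = [x * x for x in xs]   # assumed initial profile u(x, 0) = x^2
lam = 0.5                 # assumed mesh ratio

history = [u]
for _ in range(5):        # five time steps, one per tabulated column
    u = ftcs_step(u, lam)
    history.append(u)

for level, values in enumerate(history):
    print(level, [round(v, 4) for v in values])
```

With λ = 0.5 each interior value is simply the average of its two neighbours at the previous time level, which makes the table easy to check by hand.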





42 At t = 1 with λ = 0.4 and Δt = 0.05
Explicit

x  0  0.2     0.4     0.6     0.8     1.0
u  0  0.1094  0.2104  0.2939  0.3497  0.3679





43 


44 


46 


47 


48 


50 


51 


54 V 


55 


56 


57 











Implicit
x  0  0.2     0.4     0.6     0.8     1.0
u  0  0.1082  0.2095  0.2954  0.3551  0.3679

x    t=0   t=0.02  t=0.04   t large
0    0     −0.04   −0.0799  → −1
0.2  0.16  0.12    0.0803   → −0.8
0.4  0.24  0.2002  0.1613   → −0.6
0.6  0.24  0.2012  0.1657   → −0.4
0.8  0.16  0.1269  0.1034   → −0.2
1    0     0       0        → 0
u — ie "sinnx- ae” sin 37x + Бе?” зїп 5ЛХ 
ф=х?у+ зїп qa ERE) 
inh 7 

r py 
u(r, 0) - -sin0 — (2) sin(30) 

a a 


v — const gives circles with centre (=, 0) and 
radius I/|v — 1| 

u = const gives circles with centre (1. =) and 
radius 1/|u| е 


Sin nzx. 
и=х+® D ous 
nsinh nt 


x (disi nny + (-1)' sinh mm(1 — y)j 
Boundary conditions are u(0, y) = u(a, y) = 0,
0 ≤ y ≤ a; and u(x, 0) = 0, 0 ≤ x ≤ a, u(x, a) = u_0,
0 < x < a





For Δx = Δy = 0.5, u(0.5, 0.5) = 0.3125
For Δx = Δy = 0.25, u(0.5, 0.5) = 0.3047


At two sample points

For Δx = Δy = 1/2, u(0.5, 0.5) = 0.6429 and
u(0.5, 1) = 0.5714

For Δx = Δy = 1/4, u(0.5, 0.5) = 0.6379 and
u(0.5, 1) = 0.5602

u(1, 1) = 10.674, u(2, 1) = 12.360, u(3, 1) = 8.090, 
u(1, 2) = 10.337, u(2, 2) = 10.674 





58 


59 


60 


61 


62 


63 


64 


66 


68 


69 


71 
h = 1/2 gives φ(0.5, 0.5) = 1.8611 and
φ(0.5, 1) = 1.3194
For h = 1/4, φ is given in the table

1.6015666  1.2867647  1.0565216  1
1.9679551  1.5818015  1.2572287  1
2.2665772  1.8465073  1.4374669  1
2.5174715  2.1314338  1.6930064  1
2.75       2.5        2.25       1
x  0  0.25  0.5  0.75  1


φ(0, 0) = 1.5909, φ(0, 1) = 2.0909, φ(0, 2) = 4.7727,
φ(1, 0) = 1.0909, φ(2, 0) = 0.7727 and other values
can be obtained by symmetry.


(a) u_1 = 1/35, u_2 = 6/35

(b) u_1 = 0.1024, u_2 = 0.0208, u_3 = 0.2920, u_4 = 0.2920,
u_5 = 0.0208

Has the same solution as Exercise 57. 

u(0, 0) = 1.6818, u(0, 1) = 2.2485, u(0, 2) = 5.3121,
u(1, 0) = 1.1152, u(2, 0) = 0.7727 and other values by
symmetry. Compare with Exercise 59.


Tta an (ezr гапу) + а (20) 
T a 2 a-r 2 
2 an (že) 

T Уо 


Т(х, у, 2) = 


T(r, 0) = 





dni | 2+ 1, ) 
2. скт (x - ay. y? 


= sine (—2— = ;) 
(x-a) +y 
41 sini ( 2+1 ) 
4рскт (x tay +у? 


z-L ) 
(x tay ry 


Parabolic; r = x — y and s = x + y gives u, = 0 
Elliptic; r = —3x + y and s = x + y gives 

8(u,, + U,,.) — 9u, + 3u, +u=0 

Hyperbolic; r = 9x + y and s = x + y gives 49u, — u, = 0 





= sini ( 





u = f(2x + y) + g(x − 3y)


(a) elliptic; 
(b) parabolic; 
(c) hyperbolic 
3 


For y « 0 characteristics are (-y)? * 5X — constant 
72 elliptic if |y| < 1
parabolic if x = 0 or y = ±1
hyperbolic if |y| > 1


73 p > q or p < −q then hyperbolic; p = q then parabolic;
−q < p < q then elliptic
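
Classifications such as those in answers 69 to 73 come from the sign of the discriminant B² − 4AC of the second-order terms A u_xx + B u_xy + C u_yy. The Python sketch below illustrates the test. The exercise statements are not reproduced in this section, so the sample equation u_xx + 2p u_xy + q² u_yy = 0 (with q > 0) is an assumed form chosen because it gives the three cases listed for answer 73; the function name classify is hypothetical.

```python
def classify(A, B, C):
    # Type of A*u_xx + B*u_xy + C*u_yy + (lower-order terms) = 0,
    # decided by the sign of the discriminant B^2 - 4AC.
    disc = B * B - 4 * A * C
    if disc > 0:
        return "hyperbolic"
    if disc == 0:
        return "parabolic"
    return "elliptic"


q = 2  # assumed positive constant
for p in (3, 2, 1, -3):
    # Assumed form: u_xx + 2p u_xy + q^2 u_yy = 0
    print(p, classify(1, 2 * p, q * q))
```

Integer coefficients are used in the example so that the equality test for the parabolic case is exact; with floating-point coefficients a tolerance would be needed.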


9.10 Review exercises 


Е : 1 : 
3 y-4e ^" sin(313 (cos o t4 —— sino r) 
7 l 3 20у 5 


2 1/2 
where @; = эл(1 - xx) 
1 BON CT 


5 4,47 80y/r + 1)? 


6 T- T, 4 ф[1— егї(х/2 ү(кї))] 














7 Explicit 

X t=0 t= 0.004 t= 0.008 
0 1.0000 0.9600 0.9296 
0.2 1.0000 1.0000 0.9960 
0.4 1.0000 1.0000 1.0000 
0.6 1.0000 1.0000 1.0000 
0.8 1.0000 1.0000 0.9960 
1.0 1.0000 0.9600 0.9296 
Implicit 

x t=0 t= 0.004 t= 0.008 
0 1.0000 0.9641 0.9354 
0.2 1.0000 0.9984 0.9941 
0.4 1.0000 0.9999 0.9996 
0.6 1.0000 0.9999 0.9996 
0.8 1.0000 0.9984 0.9941 
1.0 1.0000 0.9641 0.9354 

9 
у=1 1 0.928 592 5 
у= 0.5 0.987 5743 0.9569621 0.937999 5 
у= 1 1 0.9849808 0.964 7746 0.960 1934 
х=0 х= 0.5 х= 1 yaks 
11 k= =; 


12 2= х — y, valid in the region x > у 


2 nl 
14 4а = itl 
nm (2n+1) cosh | 213 Det 


a 





15 u(x, t) - у l sin nx cos nnt 
Ten 


п=1 
17 ó— Acos(px)e "cos ot, where o? — cp! - 1 K? 


18 On r = a, v_r = 0, so there is no flow through
the cylinder r = a. As r → ∞, v_r → U cos θ and
v_θ → −U sin θ, so the flow is steady at infinity and
parallel to the x axis.
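
The two limits in answer 18 can be checked numerically. The Python sketch below assumes the standard velocity potential for uniform flow past a circular cylinder, φ = U(r + a²/r) cos θ; the exercise statement is not reproduced in this section, so that potential, the function name velocity and the default values of U and a are assumptions made for illustration.

```python
import math


def velocity(r, theta, U=1.0, a=1.0):
    # Polar velocity components from the assumed potential
    # phi = U * (r + a**2 / r) * cos(theta):
    #   v_r     = d(phi)/dr
    #   v_theta = (1/r) * d(phi)/d(theta)
    v_r = U * (1.0 - a**2 / r**2) * math.cos(theta)
    v_theta = -U * (1.0 + a**2 / r**2) * math.sin(theta)
    return v_r, v_theta


theta = math.pi / 3
print(velocity(1.0, theta))    # on the cylinder r = a: v_r is zero
print(velocity(1.0e6, theta))  # far field: close to (U cos θ, −U sin θ)
```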


CHAPTER 10 


Exercises 
1 x = 1, y = 1, f = 9


3 Original problem: 
20 of type 1, 50 of type 2, profit = £1080, 70 m
chipboard remain


Revised problem: 
5 of type 1, 75 of type 2, profit = £1020, 5 m chipboard 
remain 


4 4 kg nails, 2 kg screws, profit 14 p
5 9 of CYL1, 6 of CYL2 and profit £54 


6 LP solution gives x_1 = 66.67, x_2 = 50, f = £3166.67.
Profit is improved if more cloth is bought, up to a
maximum when the amount of cloth is increased to
600 m then x_1 = 0, x_2 = 150 and f = £4500.


7 Forkz 60x, 29 2,550 a TK 
For 60 ≥ k ≥ 10: x_1 = 6, x_2 = 7, z = 140 + 6k
For k ≤ 10: x_1 = 0, x_2 = 10, z = 200


8 x_1 = 1, x_2 = 0.5, x_3 = 1, x_4 = 0, f = 6.5
9 B1, 0; B2, 15000; B3, 30000; profit £21 000 


10 Long range 15, medium range 0, short range 0, 
estimated profit £6 million 


11 Many solutions of the form x_1 = 1.5 − 1.5t, x_2 = 0,
x_3 = 2.5 − 1.5t, x_4 = 3t where 0 ≤ t ≤ 1 giving f = 14.


12 x = 1, y = 4, f = 9
13 x =1, x = 10, f=20 
14 Boots 50, shoes 150, profit £1150 


15 B1, 0; B2, 10 000; B3, 40 000; profit is down to 
£20 000 


16 x = 3, y = 0, z = 1, f = 3
17 x_1 = 2, x_2 = 0, x_3 = 2, x_4 = 0, f = 12


18 36.63% of A, 44.55% of B, 18.81% of C, profit per
100 litres £134.26 


19 6 of style 1, 11 of style 2, 6 of style 3, total profit 
£37 500 


20 x, = 2500 m’, x, = 1500 m’, x; = 1000 m’, profit £9500 
21 х= 1,у= 

22 х= +аапіу= 0 

23 x = a/√2, y = b/√2, area = 2ab


24 Several possible optima: (0, 3, 0); (3,5, 15; 
(6 — 31, 0, f) for any f 


25 (0, l; 1); (0, -l, =l; (2, =l; DAJ/T; -Q. =l; DA? 


26 For given surface area S, b = c = 2a, where a’? = 1S 


and V 2 4d? 


ue 
12 
27 А= –1.83, В = 0.609, I = 81.4 


28 For α ≥ 0 minimum at (0, 0); for α < 0 minimum at
(2α, −3α)/5


29 (a) Bracket (without using derivatives) 0.7 < x < 3.1
(b) Iteration 1:

a  0.7     f(a)  2.7408
b  1.9     f(b)  2.177
c  3.1     f(c)  3.2041

Iteration 2:

a  0.7     f(a)  2.7408
b  1.7253  f(b)  2.0612
c  1.9     f(c)  2.177

gives b = 1.5127 and f(b) = 1.9497


(c)

       Iteration 1       Iteration 2
x      0.7      3.1      0.7      1.5129
f(x)   2.7408   3.2041   2.7408   1.9498
f'(x)  −4.8309  0.9329   −4.8309  0.4224

gives x = 1.1684 and f = 1.9009


30 (a) Iteration 1: 


a 0 f(a) 0 
b 1 f(b) 0.420 74
c 3 f(c) 0.014 11
Iteration 2:


a 0 f(a) 0 
b 1 f(b) 0.420 74 
c 1.5113 f(c) 0.303 96


gives x = 0.989 79 and f= 0.422 24 





(b)
            Iteration 1       Iteration 2
x        0       1        0       0.8667
f(x)     0       0.4207   0       0.4352
f'(x)    1      -0.1506   1      -0.0612

gives x = 0.8242, f = 0.4371, f' = -0.0247
31 (a) Iteration 1: 


a 1 f(a) 0.232 54 
b 1.6667 f(b) 0.255 33 
c 3 f(c) 0.141 93
Iteration 2: 
a 1 f(a) 0.232 54 
b 1.6200 f(b) 0.257 15 
c 1.6667 f(c) 0.255 33
gives x = 1.4784 and f= 0.260 22 

(b) Iteration 1:
x 1 3
f(x) 0.232 54 0.141 93
f'(x) 0.135 34 -0.087 18
Iteration 2:
x 1 1.5077
f(x) 0.232 54 0.259 90
f'(x) 0.135 34 -0.013 68

gives x = 1.4462, f = 0.260 35, f' = -0.000 14
(c) Convergence in 6 and 3 iterations to x = 1.446, 
f= 0.2603 


32 x = 1, λ_max = 2; x = 0, λ_max = 1.414 21; x = -1,
λ_max = 1.732 05. One application of the quadratic
algorithm gives x = -0.148 26 and λ_max = 1.3854.


34 (a) After five iterations x = 2.0492 and f= 1.8191. 
(b) After five iterations x = 2.1738, f= 0.0267059. 
35


36 


37 


38 


Iteration 1: 


а= 


== 
1—1 ? 





© 
c 
is 
I 
"E 


Steepest descent gives the point (—0.382, —0.255) and 


el 
a=| 5 ; = -0 
—3 
L 5 
f= 0.828 
f -29.0000 —1.5023 
x 2.0000 1.1523 
y 2.0000 2.1695 
z 2.0000 0.4741 
f 29.0000 1.2448 
x 2.0000 0.2727 
y 2.0000 1.7273 
2 2.0000 1.4545 


—0.4523 —0.0764 —0.0248 


—0.0165 
0.5185 
1.6394 
1.0301 


20 
0.5 
>15 
>1 


0.5022 
1.8214 
0.7943 


0.6214 
1.7539 
0.9630 


0.4948 
1.6654 
1.0170 


0.1056 
0.4245 
1.5733
1.1510 


0.0026 
0.4873 
1.5127 
1.0253 


0.0000 
0.4995 
1.5005 
1.0009 


>0 
— 0.5 
— 1.5 
—1 


39 y = 0.2294x and y = 0.5x - 0.2706, cost = 5.974


40 


(a) After step 1 


e Xm. urs f | 
2 1.6 


H- 0.1385 0.1923 
0.1923 0.9615 
After step 2 the exact solution x = 1, y= 1 is obtained 


(b) After cycle 1 
0.5852 


4,—]0 > 
0.2926 
0.3681 
0.1593 
—0.4047 
After cycle 2 
1.0190 
0.9813 
—0.0372 


H,= 


а, = 


—0.3918 

f-1.0662, g) = |—1,7557 
0.7822 

0.1593 —0.4047 

0.9632 0.1002 

0.1002 0.7418 

, f- 2.999 x 105, 





0.0046 
$2 — |—0.0012 
0.0027 


41 (a) After cycle 1 


i 0.485 , fe02494, gc 0.970 
—0.061 —0.242 
H- 0.995  —0.062 
—0.062 0.258 
After cycle 2 the minimum at x = 0, y = 0 is 
obtained 
(b) After cycle 1 
—0.0732 0.0386 
а= | 0.8344], /= 0.1563, = [0.1564 
0.4522 0.6296 
0.4425 0.3669 0.0998 
Н, = | 0.3669 0.7585 —0.0657 
0.0198 —0.0657 0.9821 
After cycle 2 
—0.1628 —0.2006 
а= | 0.7747|. f- 90.0321, Ф = |-0.1207 
0.0525 0.0630 
0.2820 0.1819 0.0498 
Н, = | 0.1819 0.5452 0.2380 
—0.0498 0.2380 0.8429 
The method converges to x = 0, y= 1, z = 0. 
43 (a) (0, 0) 
(b) 
f 1.3125 0.0764 0.0072 0.0007 0.0004 0.0000 


x 0.5000 —0.0950 0.0057 —0.0251 —0.0079 —0.0032 
y 0.5000 0.9165 0.9276 0.9633 0.9742 0.9978 
z 0.5000 0.7380 0.9674 1.0009 1.0044 1.0014 


10.7 Review exercises 
1 x_1 = 250, x_2 = 100, F = 3800
2 x_1 = 22, x_2 = 0, x_3 = 6, profit £370
3 Standard 20, super 10, deluxe 40, profit £21 000 
4 2kg bread and 0.5 kg cheese, cost 210p 


5 Maximum at (1, 1) and (-1, —1), with distance — (2 
Minimum at (1, —ү) and (-/2, (D), with 
distance = ү? 


—0 
—0 
—1 
—1 


6 Sides are 3/4 and 241 
7 (4,0, $), with distance 2.683 


8 (1, 2, 3) with F = 14, and (-1, 2, -3) with F = -14.
(1, 2, 3) gives the global maximum and (0, 0, 0) gives 
the global minimum. 


9 (i) b = 2c (ii) a = b = c


10 2 = Зл2/Ь, г? = За?/2Ь 


11 = 2.19 

12 Bracket: 
R 3.5 5.5 9.5 
Cost 1124 704 1418 





Quadratic algorithm gives R = 6.121 and cost = 802, 
so R=5.5 still gives the best result. After many 
iterations R = 4.4 and cost = 579 


13 Quadratic algorithm always gives x = 0.5 for any 
intermediate value. However, 


0.7729 0.7584 0.7524 0.7508 — 0.7500 
0.3147 0.5000 0.5629 0.6051 
0.5000 0.5629 0.6051 0.6243 — 0.6514 
1.0000 1.0000 1.0000 1.0000 


aera SS 


14 Maximum at θ = 5.01 rad, minimum at θ = 1.28 rad


15 44 mph 


16 At iteration 2 
(a) x = -0.0916, y = -0.1375, f = 0.0326
(b) x = -0.1023, y = -0.1534, f = 0.0323
(c) x = -0.1007, y = 0.1519, f = 0.0323. The exact
minimum is at x = -0.1026, y = -0.1540, f = 0.0323


17 Maximum of 1.056 at X = 0, Y = 0.4736, minimum of
0.5278 at X = ±0.25, Y = 2


18 Partan 


- AF 2 е e f-05 
0 0 


se f - 0.25 Н! = 0 
0.5 1 


Steepest-descent 
х,= |0? , f=0.25
| 0.5 

eye |" z=0125 
| 0.5 


19 Start values: 


a, = Wu f=1 (по improvement) 





А=1 
а = ‘ , f=-5 (ready for next iteration) 
2 
D 
20 (a) ао = 5 Е = 0.0625 
|0 
4| "9L во 
| -0.25 
а, = | 037. p 0990 
| 0.375 
[1 
(Б) а = | Е = 0.3125 
1 
а =|°%???|, F=00664 
| 3.667 


0 

а = A , F=0.239 
0.27 

ае Fan = 0.032 

0.956 
заа | 

0 

а= |05, у= 05 
0 
24 Bracket gives


α     [F(α) - 3]²
1.4   0.0776
1.5   0.0029
1.6   0.0369


Quadratic algorithm gives a@* = 1.5218 and f=9 x 10° 


CHAPTER 11 


Exercises 
1 (a) (762, 798) (b) 97 
2 76.1, (65.7, 86.5) 
3 (8.05, 9.40) 
4 (71.2, 75.2), accept 
5 (2.92, 3.92) 
6 (24.9, 27.9) 


7 95% confidence interval (53.9, 58.1), criterion 
satisfactory 


8 (—1900, 7500), reject 


9 90%: (34, 758), 95%: (—45, 837), reject at 10% but 
accept at 5% 


10 90%: (0.052, 0.089), 95%: (0.049, 0.092), reject at 
10% but accept at 5%. Test statistic leads to rejection 
at both 10% and 5% levels, and is more accurate 


11 203, (0.223, 0.327) 


12 90%: (-0.28, -0.08), 95%: (-0.30, -0.06), accept at
10% but reject at 5% 


13 (0.003, 0.130), carcinogenic 


14 (a) X: (0.34, 0.53, 0.13), Y: (0.25, 0.31, 0.44) 

(b) 0.472, (c) E(X) = 1.79, Var(X) = 0.426, 
E(Y) = 2.19, Var(Y) = 0.654, ρ_{X,Y} = -0.246

17 (a) 0.552, (b) 0.368

18 0.934 

19 0.732 

20 (0.45, 0.85) 

21 (0.67, 0.99) 


22 0.444, 90%: (0.08, 0.70), 95%: (0.00, 0.74), just 
significant at 5%, rank correlation 0.401, significant 
at 10% 





23 (a) 6, (b) 0.484, 
(c) fra) = 6(5 —x + 5x7), f(y) = OC — yy 
24 0.84 
25 a=1.22,b=2.13 
26 a = 6.315, b = 14.64, y = 226


28 (a) a = 343.7, b = 3.221, y = 537;
(b) (0.46, 5.98), reject; (c) (459, 615) 


29 a = 0.107, b = 1.143, (14.4, 17.8)
31 120 

32 Д= 2.66, C 2 2.69 x 106, P2 22.9 
33 a = 7533, b = 1.059, y = 17.9
34 χ² = 2.15, accept

35 χ² = 12.3, significant at 5%

36 χ² = 1.35, accept Poisson

37 χ² = 12.97, accept Poisson

39 χ² = 1.30, not significant

40 χ² = 20.56, significant at 5%

41 χ² = 20.7, significant at 0.5%


42 χ² = 11.30, significant at 5% but not at 1%, for
proportion 95%: (0.111, 0.159), 99%: (0.104, 0.166), 
significant at 1% 


43 c = 4, M_X(t) = 4/(t - 2)², E(X) = 1, Var(X) = 1/2
45 0.014 
46 0.995 


48 Warning 9.5, action 13.5, sample 12, UCL = 11.4, 
sample 9 


49 UK sample 28, US sample 25 

50 Action 2.93, sample 12 

51 Action 14.9, sample 19 but repeated warnings 
52 (a) sample 9, (b) sample 9 
53 (a) sample 10, (b) sample 12 
54 sample 10 

55 sample 16 


56 (a) Repeated warnings, 
(c) sample 14 


(b) sample 15, 


58 sample 11 


59 Shewhart, sample 26; cusum, sample 13; 
moving-average, sample 11 


60 0.132 

61 0.223, 0.042 

63 (a) 4, (b) 24, (c) 0.237,
(d) 45 min, (e) 0.276 

65 Mean costs per hour: A, £200; B, £130 

66 6 

67 Second cash desk 

68 29.4%

69 Sabotage 

70 P(C|two hits) = 0.526 

71; 

72 (a) 0.0944, (b) 0.81

73 (а) 2, (b) [1+ (D'27]" 

74 AAAA 


75 1.28:1 in favour of Poisson 
77 2.8:1 in favour of H,
78 12.8:1 in favour of H, 


11.12 Review exercises 
1 Z=0.27, accept 

2 (0.202, 0.266) 

3 (96.1 x 10°, 104.9 x 10°) 


4 χ² = 3.35 (using class intervals of length 5, with a
single class for all values greater than 30), accept 
exponential 


5 Outlier 72 significant at 5%, outlier included 
(7.36, 11.48); excluded (7.11, 10.53) 


6 χ² = 20.0, significant at 2.5%
7 Operate if p > 5 


8 Cost per hour: A, £632.5; B, £603.4 
pa 
pat pa 





9 (a) P(input OJoutput 0) — etc. 


(b) p< a<l-p 





A 


abscissa of convergence of Laplace 
transforms 354 
AC circuits (application) 335-6 
action limits in control charts 965 
Adams-Bashforth formulae 133 
Adams-Moulton formulae 140
addition of matrices 4 
addition rule 907 
adjoint matrix 5 
algebraic multiplicity of eigenvalues 23 
aliasing error 688 
alternative solutions in linear 
programming 854 
amplified gain 464 
amplified input 462 
amplitude gain 464 
amplitude modulation (application) 
703—9 
demodulation stage 707—8 
final signal recovery 708—9 
information-carrying signal 706—7 
and transmission 705—6 
amplitude ratio 464 
amplitude spectrum 615, 648, 711 
analogue filters 545 
application 700—2 
analytic function 283 
applied probability 906 
arbitrary constant 735 
arbitrary function 735-40 
arbitrary inputs in transfer 
functions 446-9 
Argand diagram 259 
artificial variable 862 
associative law 4 
asymptotically stable system 104 
attenuated input 462 
attribute 964 


augmented matrix 9 
average power 621 


B 


base set 560 

basic variables 849 

Bayes’ theorem 986-90, 987 

applications 988-90
derivation 986—8 

beams, bending of 424-7 

bending of beams 424-7

Bernoulli, 560 

Bernoulli distribution 909 

Bessel's equality 628 

BFGS method 890 

bilateral Laplace transform 349, 658 

bilinear mappings of complex 
functions 273—9 

bilinear transform method 549 

binding constraints 848 

binomial distribution 909 

Blackman window 717-18 

block diagram algebra 429 

blood-flow model (application) 834—7 

Bode plots 466 

Boole, George 482 

boundary conditions in partial differential 
equations 826-30 

boundary-value problems in differential 
equations 161-2 

bracket 877 

bracketing procedure 876—7 

breakpoint 467 

Brigham, E.E. 693 

Broyden 890 

Butterworth approximation 700 

Butterworth filter 545 
INDEX


C


canonical form of matrices, reduction to 39, 
39—53 
diagonal form 39—42 
Jordan canonical form 42-5 
quadratic forms 47—53 
canonical representation of equations 99 
capacitor microphone (application) 
107-10 
capacitors 382 
carrier signal 655 
Carslaw, H.S. 823 
Cauchy-Goursat theorem 321 
Cauchy-Riemann equations 283-7, 292
Cauchy's conditions in partial differential
equations 826, 827
Cauchy’s integral theorem 325 
Cauchy’s theorem 320—7 
causal functions 349 
causal sequences 483 
Cayley-Hamilton theorem 56 
central difference scheme 140 
central limit theorem 910 
proof 956-7 
chain rule 187 
Chapman, M.J. 545, 549, 702, 709 
characteristic equation 429, 510 
characteristic function 953 
characteristic polynomial 15 
characteristics in partial differential 
equations 745 
chemical processing plant 
(application) 896—8 
Chi-square distribution and test 946-8 
contingency tables 949-51 
circular frequency 562 
closed boundary 826 
coefficients of Fourier series at jump 
discontinuities 599—601 
collocation methods 164 
column rank matrix 66, 67 
column vector 3 
columns 761 
commutative law 4 
not satisfied 4 
companion form 83 
complement of event 907 
complementary function in Laplace transform 
methods 373 
complementary function of matrices 90 
complex differentiation 282—94 
Cauchy-Riemann equations 283—7, 292 
conjugate functions 288-90 
harmonic functions 288—90 
mapping 290—4 


complex form of Fourier series 609 
complex frequency 346 
complex frequency domain 348, 648 
complex functions 259-82 

bilinear mappings 273-9 

inversion 268-73 

linear mappings 261-8 

polynomial mappings 280—2 
complex integration 

fundamental theorem of 321 
complex series 295—307 

Laurent series 303-7 

power series 295—9 

Taylor series 299—302 
composite-function rule 187 
conditional distribution 929 
conditional probability 907 
confidence interval for mean 914-17, 

915 

conformal mapping 290 
conjugate functions 288-90 
conjugate-gradient methods 888 
conservative force 218, 246 
contingency tables 949-51 
continuity correction 910 
continuity equation 248 
continuous Fourier spectra 648-50 
continuous Fourier transform 684—92 
continuous random variables 908 
continuous source 821 
continuous-time systems 347 
continuous variables 907 
contour integral 317-20 
contour integration 317-34 

Cauchy’s theorem 320-7 

contour integrals 317-20, 318 

deforming 321 

evaluation of definite real integrals 

331-4 

residue theorem 328-31, 329 
control charts 964 
controllable modes in matrices 100
convergence 

in Fourier series 584-7 
convergence rate of eigenvalues 32 
convolution 

in discrete-linear systems 524—7 

in Fourier transforms 673—5 
convolution integral 443 
convolution sum 525 
Cooley, J. W. 638, 693 
corner frequency 467 
correlation 929-33, 930 

partial 933 

rank 936-7 

and regression 943 

sample 933—5 


coupled first order equations 151—6 


Courant, Friedrichs and Lewy (CFL)
condition 765 
covariance 929-33, 930 


Crank-Nicolson method for solution of 


heat-conduction/diffusion 
equation 782 


cumulative distribution function 908 


curl-free motion 209 
curl of a vector field 206—9 


current 382 


current in field-effect transistor 
(application) 338—40 


customers 975 


Cusum control charts 968-71


D


D’Alembert, J. de 560
D’Alembert solution in partial differential
equations 742-51, 745
damped sinusoids 360
dampers 386
Dantzig 847
Davidon 888
DD transform 547-53
definite real integrals, evaluation of 331-4
deflation methods in matrices 34
deforming the contour 321
degeneracy of a matrix 26
degree of belief 989
degrees of freedom 919
delay theorem 397
delta function 413
delta operator 547
delta operator (application) 547-53
q (shift) operator 547
dependent variable 259
derivatives
Laplace transforms of 370-1
of scalar point function 199-202
of vector point function 203-13
curl of a vector field 206-9
divergence of vector field 204-6
vector operator 210-13
determinants
of mappings 274
of a matrix 3, 5
DFP method 888, 889-90
diagonal matrix 3, 40, 66
diagonalization 40
difference between means 921-2
difference equation 482
in discrete-time systems 502-3
solutions 504-8
differential 196
differential equations 
Laplace transform methods on 370-80 
ordinary differential equations 372-7 
simultaneous differential 
equations 378—80 
step and impulse functions 403-7 
transforms of derivatives 370—1 
transforms of integrals 371-2 
numerical solution of 
boundary-value problems 161-2 
coupled first order equations 151-6 
first order 117-51 
on engineering problems 125—7 
Euler’s method 118—24 
local and global truncation errors 
134-6 
multi-step methods 128-34 
predictor-corrector methods 
136-41 
Runge-Kutta methods 141—4 
software libraries on 149-51 
stiff equations 147-9 
functional approximation methods 
164-70 
higher order systems, state-space 
representation of 156—9 
method of shooting 162—4 
see also partial differential equations 
differentiation of Fourier series 597—8 
diffusion equation in partial differential 
equations 725, 728-31 
solution of 768—84 
Laplace transform method 772—7 
numerical solution 779-84 
separation method 768-72 
sources and sinks for 820—3 
digital filters (application) 709-15 
and windows 715-18 
digital replacement filters 546-7 
Dirac delta function 413 
direct form of state equations 89—91 
directional derivative 201 
directional field 118 
Dirichlet, L. 561 
Dirichlet’s conditions 561, 584 
for Fourier integral 641 
in partial differential equations 826,
828, 829 
discrete Fourier transform 680—4 
discrete Fourier transform (DFT) pair 683 
discrete frequency spectra 615, 639 
discrete-linear systems 509-26 
convolution 524—7 
impulse response 515-18 
stability 518-24 
discrete-time Fourier transform (DTFT) 710 
discrete-time signal 482 




discrete-time system 347 
constructing 549-51 
design of (application) 544—7 
analogue filters 545 
digital replacement filters 546—7 
difference equations in 502-3 
discrete variables 907 
discretization of continuous-time state-space 
models 538-43 
Euler’s method 538-40 
step-invariant method 540—3, 541 
disjoint events 907 
displacement 386 
dissipative force 218 
distensibility 836 
distribution 414, 907 
of sample average 913-14 
distributive law 4 
domain of dependence 746 
domain of function 259 
domain of influence 746 
dominant eigenvalue 31 
double integrals 219-24, 220 
duality property 656 
Duhamel integral 443 
dynamic equations 84 


E 


echelon form of a matrix 9 
eigenvalues 2, 14-30, 17 
characteristic equation 15-17 
method of Faddeev 16 
and eigenvectors 17—22 
pole location 470-1 
and poles 470 
repeated 23-7 
useful properties 27-9 
eigenvectors 2, 14, 17 
electrical circuits (application) 382-6 
electrical fuse, heating of (application) 
174-8 
element of a matrix 3 
elliptic equations 824 
energy 663 
energy signals 668 
energy spectral density 664 
energy spectrum 664 
engine performance data (application) 
958-64 
mean running times and 
temperatures 959-62 
normality test 962-3 
equal matrices 3 


equality-constrained optimization 844 

equality constraints in Lagrange 
multipliers 870—4, 871 

equivalent linear systems 99 

essential singularity 308 

Euler, L. 560 

Euler's formula 563 

Euler’s method 538-40 

Euler’s method on differential 
equations 118—24, 120 

analysis 122-4 

even functions in Fourier series 
expansion 573-7 

even periodic extension 589 

events 907 

Everitt, B.S. 951 

exact differential 197 

excitation term 372 

expected value 907 

explicit formula 

for solution of heat-conduction/diffusion 

equation 779 

explicit methods in partial differential 
equations 765 

exponential distribution with parameter 976 

exponential form of Fourier series 609 

exponential modulation theorem 358 

exponential order of functions 353, 354 


F 


Faddeev method on eigenvalues 16 

faltung integral 443 

Fannin, D.R. 717 

feasible basic solution 849 

feasible region 847 

Fermat, Pierre de 871 

Feshbach, H. 830 

Fick’s law 729 

field-effect transistor (application) 338—40 

filter length 713 

filters 545 

final-value theorem 439—40 

finite calculus 482 

finite difference methods 482 

finite-difference representation 762 

finite-difference techniques 164 
Fourier’s theorem 562-6
Fourier transforms 346, 638-50
Fourier transform pair 644-8
properties of 652-7
615
full-rank matrix 66, 67, 77

functional approximation methods in 
generalized derivatives 420
Laplace transform method 772-7
numerical solution 779-84 
separation method 768-72 
Heaviside theorem 397
Helmholtz equation 734 
hill climbing 867, 875-95
implicit methods in partial differential
impulse functions 413-14, 415-18
in Fourier transforms 663-75 
Laplace transforms on 403-7 
impulse invariant technique 547 
impulse response in transfer functions 436-7 
impulse sequence 486, 515 
indefinite quadratic forms 49
independent events 907, 929 
multipliers 874
of Laplace transforms 437-8
of z transforms 493 
inner (scalar) product 
input-output block diagram 428 
integral solutions to partial differential
equations 815-23 
separated solutions 815-17 
singular solutions 817-20 
integral transforms 346 
integrals, Laplace transforms of 371-2 
integration 
of Fourier series 595-7 
double integrals 219-24
Gauss’s divergence theorem 241-4
Green's theorem 224-9
line integral 215-18
Stokes’ theorem 244-7 
surface integrals 230—6 
volume integrals 237—40 
inverse Laplace transform 364-5
and first shift theorem 367-9
inverse Laplace transform operator 364 
properties 6
inverse transform 364
inverse z transform operator 494 
Lagrange interpolation formula 879
Lagrange multipliers 870—4, 871 
inequality constraints 874
Laplace equation in partial differential 
equations 725, 731-3 
solution of 785—801 
numerical solution 794-801
separated solutions 785-92
Laplace transform 348-69
bending of beams 424-7
definition and notation 348—50, 392-5 
derivative of 360-1 
on differential equations 370—80 
ordinary differential equations 372-7 
simultaneous differential 
equations 378—80 
step and impulse functions 403—7 
transforms of derivatives 370—1 
transforms of integrals 371-2 
and Fourier transform 658-60
and Heaviside step function 418-23
inverse transform 364—5 
evaluation of 365-7 
and first shift theorem 358, 367—9 
kernel of 348 
limits 348-50 
one-sided (unilateral) transform 349
periodic functions 407-11 
pole placement (application) 470—1 
properties of 355-63 
second shift theorem 397—400 
inversion 400-3 
sifting property 414-15 
simple functions 350—3 
solution to wave equation 756—9 
state-space equations, solution of 450—61 
table of 363 
transfer functions 428—49 
and arbitrary inputs 446—9 
definitions 428-31
final-value theorem 439-40 
impulse response 436—7 
initial-value theorem 437—8 
stability in 431-6 
two-sided (bilateral) transform 349 
unit step function 392, 395—7 
and z transforms 529-30 
for solution of heat-conduction/diffusion
equation 772—7 
Laplace transform operator 348 
Laplace transform pairs 348 
least squares in hill climbing 892-5
left singular vector matrix 71
lilinear transform 547
limit-cycle behaviour 632 
line spectra 615
linear operator
simplex algorithm 849-59
marginally stable linear system 431-2
reduction to canonical form 39-53
singular value decomposition 66-81
SVD 72-5
singular values 68—72 


solution of state equation 89—102 
canonical representation 98—102 
direct form 89-91 


spectral representation of response 95-8 
transition matrix 91 
evaluating 92-4 
state-space representation 82-8 
multi-input-multi-output (MIMO) 
systems 87-8 
single-input-single-output (SISO) 
systems 82-6 
vector spaces 10-14
linear dependence 10—12 
matrix 3
maximum of objective function 853 
mean 907 
when variance unknown 918-20 
mean square error in Fourier series 627 
mechanical vibrations (application) 386-90
meromorphic poles 309
method of separation of variables 751 
Middleton, R.M. 548, 553 
minimal form 457 
modal form in matrices 96
modal matrix 39
modes in matrices 96
modulation 
in Fourier transforms 655 
moment generating functions 953—7 
definition and applications 953—4 
Poisson approximation to the 


binomial 955-7 
Moore-Penrose pseudo inverse square matrix 
Morse, P.M. 830
motion in a viscous fluid (application) 
116-17 
moving-average control charts 971-2 
multi-input-multi-output (MIMO) systems 
in Laplace transforms 455-61 
in matrices 87-8 
negative-definite quadratic forms 49, 51
negative-semidefinite quadratic forms 
49, 51 
net circulation integral 218 
Neumann conditions in partial differential
equations 826, 828, 829
Newton method 884, 885
Newton-Raphson methods 884
Newton's law 386
Nichols diagram 469 
nodes 761 
non-anticipatory systems 349 
non-basic variables 849, 854 
non-binding constraints 848 
non-conservative force 218 
non-negative eigenvalues 68 
non-square matrix 66 
non-trivial solutions of matrices 8 
nonlinear regression 943—4 
normal distribution 910-11 
normal residuals in regression 941-2 
normalizing eigenvectors 20 
nth harmonic 562 
null matrix 3 
Nyquist approach 469 
Nyquist interval 688 
observable state of matrix 100
odd periodic extension 590

offsets 440 

Ohm’s law 382 

one-dimensional heat equation 729 

one-sided Laplace transform 349 

open boundary 826 

Oppenheim, A.V. 717 
chemical processing plant
(application) 896-8 
heating fin (application) 898—900 
hill climbing 867, 875—95 
Lagrange multipliers 870—4 
linear programming 847-69 
order of pole 308 
order of the system 429, 510 
ordinary differential equations, Laplace 
transforms of 372-7 
orthogonal functions 624—9 
orthogonal matrix 13 
orthogonal set 624 
orthogonality relations 563 
orthonormal set 625 
oscillating systems (application) 603—6 
oscillations of a pendulum 
(application) 170—4 
over determined matrix 75, 78 
Page, E. 983
parabolic equations 825 
parameters 909 
estimating 912-24 
confidence interval for mean 914-17, 
915 
difference between means 921-2 
distribution of sample average 913-14 
hypothesis tests 912-13 
interval and test for proportion 922-4 
interval estimate 912-13 
mean when variance unknown 918-20
testing simple hypotheses 917-18 
parasitic solutions in differential equations 
132 
Parseval's theorem 612, 614, 664 
partial correlation 933 
partial derivative 185 
partial differential equations 724 
arbitrary functions and first-order 
equations 735-40
boundary conditions 826-30
finite elements 802-14
formal classification of 824—6 
heat-conduction or diffusion 
equation 725, 728-31 
solution of 768—84 
Laplace transform method 772—7 
numerical solution 779-84 
separation method 768-72 
sources and sinks for 820—3 


Helmholtz equation 734 
integral solutions 815-23 
separated solutions 815-17 
singular solutions 817-20 
Laplace equation 725, 731-3 
solution of 785—801 
numerical solution 794—801 
separated solutions 785—92 
Poisson equation 734 
Reynolds number 733 
Schrödinger equation 734
wave equation 725-8 
solution of 742-67 
D’Alembert solution 742-51, 745 
Laplace transform solution 756—9 
numerical solution 761—7 
separated solutions 751—6 
particular integral in Laplace transform
methods 373 
Paterson, Colin 548 
path of line integral 215 
170-4
periodic extension 588
periodic functions 561-2 
phase angle 562 
phase plane 83 
phase quadrature components 563 
phase shift 464 
phase spectrum 615, 648, 711 
phases in linear programming 862—6 
point at infinity 303 
Poisson approximation to the binomial 
955—7 
Poisson distribution 909 
Poisson equation 734 
Poisson process in queues 975—7 
polar plot 469 
pole placement (application) 470-1 
pole-zero plot 429 
poles 308, 510 
and eigenvalues 470 
polynomial approximation 879 
polynomial mapping 280—2 
population mean 913
positive definite function 104
positive-definite quadratic forms 49, 51 
positive-semidefinite quadratic forms 49, 51 
Powell 888
power spectrum 621-3, 622
principal minor of matrices 50
principle of superposition 446 
probability density function 908
probability theory 906-12 
proof 956-7

normal distribution 910-11 

Poisson distribution 909 

random variables 907—9 

rules 907

sample measures 911-12 
pseudo inverse square matrix 75-81, 76
q (shift) operator 547
quadratic forms of matrices 47-53, 105
quadratic polynomial 879 
quasi-Newton method 884 
queues 974-85 
multiple service channels queue 982-3 
Poisson process in 975-7 
problems 974-5 
simulation 983-5 
single service channel queue 978-82 
R
random variables 907-9
rank of a matrix 9-10
rectangular matrix 66
rectangular window 712 
regression 938-44, 939 
least squares method 939-41
nonlinear 943-4
normal residuals 941—2 
regular point of f(z) 308, 309
removable singularity 309 
residual of equation 165
residue theorem 328-31, 329 
resistors 382
response in differential equations 372
Reynolds number in partial differential 
right singular vector matrix 71
robust methods 906
root mean square (RMS) 614 
rotational motion 209
Routh-Hurwitz criterion 434 
rows 761
sample space 907
sample variance 912 
scalar field 183, 210
scalar Lyapunov function 104 
scalar point function 182 
derivatives of 199—202 
gradient 199—200 
scatter diagrams 933 
Schafer, R.W. 717 
Schrödinger equation 734
Schwarzenbach, J. 442 
second shift property of z transforms 491-2 
second shift theorem 397—400 
inversion 400-3 
of partial differential equations 815-17
to wave equation 751-6 
separation method for solution of 
heat-conduction/diffusion 
equation 768-72 
service discipline 975
service time 975 
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Shanno 890 
Shewhart attribute control charts 964-7
Shewhart variable control charts 967-8
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signals 347 
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signum function 671 
similarity transform 39 
simple pole 309 
simplex algorithm 849-53, 850 
general theory 853-9 
simplex method 848 
simulation, queues 983-5
simultaneous differential equations, Laplace 
transform on 378-80 
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Singer, A. 835 
in Laplace transforms 450-4
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single multivariable searches in hill 
climbing 882-7
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875-81 
singular points 871 


singular solutions of partial differential 
66-81, 72-5
pseudo inverse 75-81
singular value matrix 68 
singularities 303, 308-11 
sinks in solution of heat-conduction/diffusion 
equation 820-3 
sinusoids, damped 360 
skew symmetric matrix 3 
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Snell's law 872 
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sources in solution of heat-conduction/ 
diffusion equation 820-3 
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spectral leakage 714 
spectral matrix 39 
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spectral representation of response of state 
equations 95-8 
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square non-singular matrix 76 
stability 
in differential equations 132 
in discrete-linear systems 518-24
in transfer functions 431-6 
stable linear system 431 
standard deviation 908 
standard form of transfer function 466 
standard normal distribution 910 
standard tableau 853 
state equation 83 
state equation, solution of 89-102
canonical representation 98-102
direct form 89-91
spectral representation of response 95-8 
transition matrix 91 
evaluating 92-4 
state feedback 470 
state-space 2, 83
state-space form 552 
state-space model 84 
state-space representation 
of higher order systems 156-9
in Laplace transforms 450-61
multi-input-multi-output (MIMO) 
systems 455-61 
single-input-single-output (SISO) 
systems 450-4
in matrices 
multi-input-multi-output (MIMO) 
systems 87-8 
single-input-single-output (SISO) 
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steady-state gain 440 
Stearns, S.J. 717 
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Laplace transforms on 403-7 
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stiffness matrix 806 

Stokes’ theorem 244-7, 245 

stream function 248 

streamline in fluid dynamics 
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method 797-8 

sum of eigenvalues 27 
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superposition principle 446 

surface integrals 230-6
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Sylvester's conditions 50, 105 
symmetry property 655-7, 656
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system frequency response 661 
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Taylor series expansion 300 
Taylor theorem 884 
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thermally isotropic medium 250 
Thomas algorithm 766 
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total differential 196 


in vector calculus 196-9 
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trajectory 83 

transfer functions 428-49


and arbitrary inputs 446-9 
convolution 443-6
definitions 428-31 
final-value theorem 439-40 
impulse response 436-7 
initial-value theorem 437-8

transfer matrix 456 

transformations 192, 259 
in vector calculus 192-5
of vector spaces 12-13 

transition matrix 91 
in discrete-time state equations 533 
evaluating 92-4 

transition property 91 

translation 263, 264 
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transposed matrix 3, 4, 68 
properties 4 

Tranter, W.H. 717 

travelling waves 744 

trial function 165 

triangular window 717 

Tucker 874 

Tukey, J.W. 638, 693 

Tustin transform 547, 549 

two-dimensional heat equation 731 

two-phase method 861-9

two phase strategy 864 

two-sided Laplace transform 349 

type I error 917 
uncontrollable modes in matrices 100

under determined matrix 75 

unilateral Laplace transform 349, 658 

unit impulse function 413 

unit matrix 3 

unit pulse 486 

unit step function 392, 395-7

unitary matrix 68 

unobservable state of matrix 100 

upper control limit (UCL) in control 
charts 966 
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variable 964 
variance 907 
unknown, mean when 918-20 
variational problems 899 
vector calculus 181-256
basic concepts 183-91
derivatives of scalar point function 199-202
gradient 199-200
derivatives of vector point function 203-13
curl of a vector field 206-9
divergence of a vector field 204-6
vector operator 210-13
domain 182
integration 214-47
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